dotimplement / HealthChain

Simplify testing and validating AI and NLP applications in a healthcare context 💫 🏥
https://dotimplement.github.io/HealthChain/
Apache License 2.0

Adding Random Seed to Data Generators #20

Open adamkells opened 5 months ago

adamkells commented 5 months ago

Description

Add a random seed to the DataGenerator module.

Context

There are a number of places in the code where numpy is used to randomly generate data. To ensure reproducibility, the user should be able to set a seed and have it flow through the code to be used by all generators.

Possible Implementation

This will mostly involve finding all functions that contain random components, adding a seed argument to each, and passing that seed to NumPy's random number generator (for example np.random.seed(seed) or np.random.default_rng(seed)).
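A minimal sketch of how a seed could flow through several generators, assuming NumPy is the source of randomness. The names `AgeGenerator`, `WeightGenerator`, and `generate_patient` are hypothetical and are not taken from the HealthChain codebase; the idea is only that one seeded `np.random.Generator` is created at the top and shared by every generator.

```python
from typing import Optional

import numpy as np


class AgeGenerator:
    """Hypothetical generator, for illustration only (not HealthChain code)."""

    def __init__(self, rng: np.random.Generator) -> None:
        self.rng = rng

    def generate(self) -> int:
        # Draw an integer age in the range [0, 100).
        return int(self.rng.integers(0, 100))


class WeightGenerator:
    """Hypothetical generator, for illustration only (not HealthChain code)."""

    def __init__(self, rng: np.random.Generator) -> None:
        self.rng = rng

    def generate(self) -> float:
        # Draw a weight in kg from a normal distribution (mean 70, sd 15).
        return float(self.rng.normal(70.0, 15.0))


def generate_patient(seed: Optional[int] = None) -> dict:
    """Create one seeded Generator and pass it to every sub-generator."""
    rng = np.random.default_rng(seed)  # seed=None gives non-reproducible output
    return {
        "age": AgeGenerator(rng).generate(),
        "weight_kg": WeightGenerator(rng).generate(),
    }


if __name__ == "__main__":
    # The same seed yields the same record on every call.
    print(generate_patient(seed=42) == generate_patient(seed=42))
```

Sharing a single `Generator` instance, rather than calling `np.random.seed()` globally, keeps the seed local to the data generation code and avoids affecting other users of NumPy's global random state.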

deevyanshoo commented 1 month ago

Hello @adamkells, I can take up this issue. Please assign it to me.

adamkells commented 1 month ago

Hi @deevyanshoo, thank you for taking an interest in contributing! 😊 We've decided to leave issues unassigned, so feel free to go ahead and start working on it. Let me know if anything is unclear or if you need help!

ni9999 commented 1 month ago

Hi @adamkells, I've added random_seed in #80. It seems this codebase uses Faker rather than numpy to generate data. Please review and let me know if any changes need to be made.