UBC-MDS / group29

Project Repo for Group 29 for DSCI 522

MIT License


milestone 4 #77

Closed by sukh2929 3 years ago

sukh2929 commented 3 years ago

1. Create a dependency diagram (should be in the project directory)
2. Create a Dockerfile (should be in the project directory)
3. Create a Docker image
4. Update the README to add a link to the dependency diagram
5. Update the README to explain how to use the project with and without Docker

rachelywong commented 3 years ago

According to TA and peer feedback this is what we need to change for script5:

- explain why we used balanced log regression (explain when we talk about the model scores)
- remove scatterplot matrix?
- explain sampling (why we chose to use 1000)
- be more explicit about which features we ended up keeping and dropping
- fix conclusion about model scores and overall conclusion (especially part about logistic regression balanced performing average)
- talk more about figure 1 in results and discussion
- add why fit and score times, accuracy, and f1 scores were chosen as evaluation metrics
- grammar check
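
For the points above on balanced logistic regression and the choice of metrics, here is a minimal sketch of how those pieces could fit together in scikit-learn. The synthetic data from `make_classification`, the 90/10 class ratio, `max_iter=1000`, and the 5-fold setting are assumptions for illustration only, not the project's actual data or settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Hypothetical imbalanced data standing in for the real training set.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=123)

# class_weight="balanced" weights each class inversely to its frequency.
model = LogisticRegression(class_weight="balanced", max_iter=1000)

# cross_validate reports fit and score times alongside the requested metrics.
scores = cross_validate(model, X, y, cv=5, scoring=["accuracy", "f1"])

for name in ["fit_time", "score_time", "test_accuracy", "test_f1"]:
    print(f"{name}: {scores[name].mean():.3f}")
```

On imbalanced data, accuracy alone can look good for a model that rarely predicts the minority class, which is one common reason to report f1 alongside it.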

jraza19 commented 3 years ago

Edits to the README after making the Dockerfile include:

- add random as a dependency
- change the last line of the project proposal
- add the new Docker code to the README file
- create a new dependency diagram

rachelywong commented 3 years ago

Add a sentence about the specific sample we pick with random.seed, and how some of the data could be missing from it.
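
To make that point concrete, here is a small sketch of how seeding fixes which rows end up in a sample. The helper `draw_sample`, the row count of 10,000, and the seed value are hypothetical and only for illustration; the sample size of 1000 echoes the number mentioned in the feedback above.

```python
import random


def draw_sample(rows, k, seed):
    """Return k rows chosen without replacement, reproducibly for a given seed."""
    random.seed(seed)  # fixing the seed fixes which rows are picked
    return random.sample(rows, k)


# Hypothetical row ids standing in for the full dataset.
rows = list(range(10_000))

sample_a = draw_sample(rows, 1000, seed=123)
sample_b = draw_sample(rows, 1000, seed=123)

# Rows that never make it into the sample are invisible to the analysis.
left_out = set(rows) - set(sample_a)
print(sample_a == sample_b, len(left_out))
```

The same seed always yields the same 1000 rows, so any pattern that lives only in the 9000 rows left out will not show up in the results.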

jraza19 commented 3 years ago

edit to eda2.py to address TA feedback:

Split the data into training (0.8) and testing (0.2)

    diabetes_with_race = pd.read_csv(datafile[1])
    train_df, test_df = train_test_split(diabetes_with_race, test_size=0.2, random_state=123)
    diabetes_subset = train_df
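
For anyone wanting to try the split in isolation, here is a self-contained sketch of the same call. The 10-row DataFrame and its column names (`feature`, `target`) are hypothetical stand-ins for the CSV that eda2.py reads; only the `train_test_split` arguments come from the snippet above.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical 10-row frame standing in for pd.read_csv(datafile[1]).
diabetes_with_race = pd.DataFrame(
    {"feature": range(10), "target": [0, 1] * 5}
)

# Split the data into training (0.8) and testing (0.2).
train_df, test_df = train_test_split(
    diabetes_with_race, test_size=0.2, random_state=123
)

# EDA then uses only the training portion, so the test rows stay unseen.
diabetes_subset = train_df

print(len(train_df), len(test_df))  # 8 2
```

Fixing `random_state` means the same rows land in each portion on every run, which keeps the EDA figures reproducible.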