UBC-MDS / group29

Project Repo for Group 29 for DSCI 522
MIT License

milestone 4 #77

Closed sukh2929 closed 3 years ago

sukh2929 commented 3 years ago

1. Create a dependency diagram (should be in the project directory).
2. Create a Dockerfile (should be in the project directory).
3. Create a Docker image.
4. Update the README to add a link to the dependency diagram.
5. Update the README to explain how to use the project with and without Docker.

rachelywong commented 3 years ago

According to TA and peer feedback this is what we need to change for script5:

    • explain why we used balanced logistic regression (explain when we talk about the model scores)
    • remove scatterplot matrix?
    • explain sampling (why we chose to use 1000)
    • be more explicit about which features we ended up keeping and dropping
    • fix conclusion about model scores and overall conclusion (especially part about logistic regression balanced performing average)
    • talk more about figure 1 in results and discussion
    • add why fit and score times, accuracy, and f1 scores were chosen as evaluation metrics
    • grammar check
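
For the points about balanced logistic regression and the choice of evaluation metrics, a minimal sketch may help when writing the explanation. It assumes scikit-learn is the modelling library, uses synthetic imbalanced data rather than the project's data, and the model names and the 90/10 class split are hypothetical. `cross_validate` reports fit and score times alongside the requested scores, so all four metrics come from one call.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Synthetic, imbalanced stand-in data (about 90% / 10%); NOT the project's data.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=123
)

# Hypothetical model names: an unweighted baseline and a class-weighted variant.
models = {
    "logreg": LogisticRegression(max_iter=1000),
    "logreg_balanced": LogisticRegression(max_iter=1000, class_weight="balanced"),
}

results = {}
for name, model in models.items():
    # Returns fit_time, score_time, test_accuracy and test_f1 for each fold.
    cv = cross_validate(model, X, y, cv=5, scoring=["accuracy", "f1"])
    results[name] = {key: float(values.mean()) for key, values in cv.items()}

for name, scores in results.items():
    print(name, scores)
```

On imbalanced data, accuracy alone can look high for a model that rarely predicts the minority class, which is one reason to report f1 next to it; the timing columns show the cost of each model.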

jraza19 commented 3 years ago

edits to the README after making the Dockerfile include:

rachelywong commented 3 years ago

add a sentence explaining that the particular sample we pick depends on random.seed, and that some of the data could therefore be missing from it
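
A small sketch of the point above, using only the standard library. The population of 10,000 integer rows and the helper `draw_sample` are hypothetical stand-ins; the sample size of 1000 is the one mentioned earlier in the thread. It shows that a fixed seed reproduces the same sample, and that most rows never make it into the sample.

```python
import random


def draw_sample(rows, n, seed):
    """Draw n rows without replacement using a dedicated, seeded generator."""
    rng = random.Random(seed)
    return rng.sample(rows, n)


# Hypothetical stand-in for the full data set.
population = list(range(10_000))

sample_a = draw_sample(population, 1000, seed=123)
sample_b = draw_sample(population, 1000, seed=123)
sample_c = draw_sample(population, 1000, seed=456)

# Same seed gives the same rows; a different seed gives a different subset.
print(sample_a == sample_b)
print(sample_a == sample_c)

# Rows that were never sampled are absent from any downstream analysis.
missing = set(population) - set(sample_a)
print(len(missing))
```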

jraza19 commented 3 years ago

edit to eda2.py to address TA feedback:

    # Split the data into training (0.8) and testing (0.2)
    diabetes_with_race = pd.read_csv(datafile[1])
    train_df, test_df = train_test_split(diabetes_with_race, test_size=0.2, random_state=123)
    diabetes_subset = train_df
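
To check the split in isolation, here is a self-contained sketch with a toy DataFrame. The column names and the 100 rows are made up; only the `train_test_split` call mirrors the edit above. It assumes pandas and scikit-learn are installed.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the diabetes data; the columns here are hypothetical.
toy_df = pd.DataFrame(
    {"feature": range(100), "target": [i % 2 for i in range(100)]}
)

# Same arguments as in eda2.py: 80% train, 20% test, fixed seed.
train_df, test_df = train_test_split(toy_df, test_size=0.2, random_state=123)

# EDA should only look at the training portion.
eda_subset = train_df

print(train_df.shape, test_df.shape)
print(set(train_df.index).isdisjoint(test_df.index))
```

Doing the split before any exploration keeps the test rows out of the EDA, so the plots and summary statistics cannot influence modelling choices using held-out data.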