For the data analysis part:
- @yixuangaoclara: Introduction, import data, and clean data (steps 1-3). Might also do some parts of the EDA (step 4) as part of data cleaning.
- @BryanLee06: Perform EDA, split the data, and create the transformer (steps 4-6).
- @SimplyTim: Create pipelines for the models and perform CV (step 7).
- @wzhu8410: Find the best hyperparameters, run the model on the test set, and visualize the results (steps 8-9).
I need to change the summary part of the notebook and the conda lock part of the README file once we finish the other parts.
Updated the table titles and made minor edits to the discussion. @BryanLee06 @yixuangaoclara @wzhu8410 any final changes?
Submitted our Milestone 1 PDF on Gradescope.
Thanks @yixuangaoclara. I'm closing this issue for Milestone 1. Great work, guys!
Hey all.
Based on our meeting today, I just wanted to finalize the distribution of work for this week:
- README.md: @yixuangaoclara
- CODE_OF_CONDUCT.md: @SimplyTim
- CONTRIBUTING.md: @wzhu8410
- data/: @yixuangaoclara
- analysis.ipynb: All members
- environment.yml: @BryanLee06

For the analysis, the requirements are documented here.
Just to summarize our discussion today, the flow of the analysis.ipynb document will include both the explanation and code for the following steps:

This is a very high-level overview of the tasks. If you believe I forgot something or said something incorrectly, just let me know! Also let me know which tasks will be done by whom. For instance, I wouldn't mind doing the coding for parts 6 to 8 from the list above.
Thanks! 😄
EDIT: I forgot to mention we would have to do separate branches for each task or subset of tasks. So for instance, the EDA might be one branch, the model selection might be another, the hyperparameter optimization might be another, etc.