WayScience / phenotypic_profiling

Machine learning for predicting 15 single-cell phenotypes from cell morphology profiles
Creative Commons Attribution 4.0 International

Refactor download module #17

Closed roshankern closed 1 year ago

roshankern commented 1 year ago

This PR is ready for review!

This is the first of many PRs to restructure this repo to use CP/merged features in addition to DP features.

Currently, the download module combines two mitocheck datasets (2006 and 2015). After discussion, Greg and I decided that only the newer dataset is needed (at least for now). The biggest change in this PR is therefore to download only the newer (2015) dataset, instead of downloading both the older and newer datasets and merging them.

The second, much smaller change is renaming the downloaded data from training_data to labeled_data. This makes it clear that the downloaded data is not the training dataset (the downloaded data gets split into training, testing, and maybe holdout sets in a future revision of this repo).
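
To illustrate why the labeled_data name fits, here is a minimal sketch of how labeled data could later be split into training, testing, and holdout sets. This is not the code in this repo: the function name split_labeled_data, the fractions, the seed, and the example rows (cell_id, phenotype, class_a/class_b) are all hypothetical and only meant to show the idea.

```python
import random


def split_labeled_data(rows, test_fraction=0.2, holdout_fraction=0.1, seed=0):
    """Shuffle labeled rows and split them into training, testing, and holdout lists.

    Hypothetical helper for illustration; not the repo's actual split logic.
    """
    if test_fraction < 0 or holdout_fraction < 0:
        raise ValueError("fractions must be non-negative")
    if test_fraction + holdout_fraction >= 1:
        raise ValueError("test and holdout fractions must leave room for training data")

    # Copy before shuffling so the caller's labeled data is left untouched.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_fraction)
    n_holdout = round(len(shuffled) * holdout_fraction)

    return {
        "testing": shuffled[:n_test],
        "holdout": shuffled[n_test:n_test + n_holdout],
        "training": shuffled[n_test + n_holdout:],
    }


# Placeholder rows standing in for the downloaded labeled_data (made-up values).
labeled_data = [
    {"cell_id": i, "phenotype": "class_a" if i % 2 else "class_b"}
    for i in range(20)
]

splits = split_labeled_data(labeled_data, test_fraction=0.2, holdout_fraction=0.1)
```

Every row of labeled_data ends up in exactly one of the three splits, so the training set is only a subset of what gets downloaded.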