This is the first of many PRs to restructure this repo to use CP/merged features in addition to DP features.
Currently, the download module combines two mitocheck datasets (2006 and 2015). However, after consideration, Greg and I decided that only the newer dataset is needed (at least for now). Thus, the biggest change in this PR is that the module now downloads only the newer dataset, instead of downloading both the older and newer datasets and then merging the two.
The second, much smaller change in this PR is renaming the downloaded data from `training_data` to `labeled_data`. This makes it clear that the downloaded data is not the training dataset (the downloaded data gets split into training, testing, and maybe holdout sets in a future revision of this repo).
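To illustrate why the rename matters, here is a minimal sketch of how the labeled data might later be split. It is not code from this PR: the function name `split_labeled_data`, the `test_fraction` and `seed` parameters, and the example rows are all hypothetical, and the real split may look quite different.

```python
import random


def split_labeled_data(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into (training, testing) lists.

    Hypothetical helper for illustration only; not part of this repo.
    """
    if not 0 <= test_fraction <= 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up rows standing in for the downloaded labeled data.
labeled_data = [{"cell_id": i, "label": f"class_{i % 2}"} for i in range(10)]
training_data, testing_data = split_labeled_data(labeled_data)
```

Here `training_data` is only one part of `labeled_data`, which is the distinction the new name is meant to capture.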
This PR is ready for review!