Closed by dlebauer 4 years ago
We will use these from the BAP population: http://datacommons.cyverse.org/browse/iplant/home/shared/terraref/genomics/derived_data/bap/resequencing/danforth_center/version1/gvcf
There is one file per genotype
This is the combined file: http://datacommons.cyverse.org/browse/iplant/home/shared/terraref/genomics/derived_data/bap/resequencing/danforth_center/version1/hapmap
Here's an example of what I'm imagining the data will look like:
Plot | Soil Moisture Day 1 Max | Soil Moisture Day 1 Min | Soil Moisture Day 1 Mean | etc for all params and all days | Emergence Date | all phenotypes | growing degree days at emergence | Anything else you think is interesting |
---|---|---|---|---|---|---|---|---|
Plot 1 | 4.2 | 2.3 | 3.2 | etc | 20 | etc | 10 | etc |
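For the "growing degree days at emergence" column, here is a minimal sketch of one common way to compute it (the simple averaging method). The thread does not say which method or base temperature to use, so the 10 °C base, the function names, and the sample temperatures are all assumptions for illustration.

```python
def daily_gdd(t_max, t_min, t_base=10.0):
    """Growing degree days for one day using the simple averaging method.

    t_base=10.0 (deg C) is an assumed base temperature, not taken from the thread.
    """
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def gdd_at_emergence(daily_temps, emergence_day, t_base=10.0):
    """Sum daily GDD from planting (index 0) through emergence_day, inclusive.

    daily_temps is a list of (t_max, t_min) tuples, one per day after planting.
    """
    return sum(
        daily_gdd(t_max, t_min, t_base)
        for t_max, t_min in daily_temps[: emergence_day + 1]
    )


# Hypothetical temperatures in deg C, for illustration only.
temps = [(30.0, 14.0), (28.0, 12.0), (18.0, 0.0)]
total = gdd_at_emergence(temps, emergence_day=2)
```

Days whose mean temperature falls below the base contribute zero rather than a negative value.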
I'm thinking we should use day counts measured from time zero at planting rather than actual calendar dates.
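A minimal sketch of that conversion, assuming each plot has a known planting date. The planting date below is made up, and treating the planting day as day 0 (rather than day 1) is also an assumption.

```python
from datetime import date


def days_after_planting(observation: date, planting: date) -> int:
    """Return whole days elapsed since planting (planting day is day 0)."""
    return (observation - planting).days


# Hypothetical planting date and observation dates, for illustration only.
planting_date = date(2017, 4, 15)
observations = [date(2017, 4, 15), date(2017, 4, 16), date(2017, 5, 5)]
day_indices = [days_after_planting(d, planting_date) for d in observations]
```

If the columns should start at "Day 1" as in the example table, add 1 to each index.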
@diatomsRcool so to confirm, you are looking for a table with one row per plot? I expect these will end up 100s of columns wide if we have min/mean/max for each environmental parameter x day.
Other than soil moisture, each column of environmental data will have the same value repeated.
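To make the one-row-per-plot shape concrete, here is a rough sketch that collapses long-format readings into wide columns named like the example table. The input layout (plot, day, parameter, value), the function name, and the sample values are assumptions, not the actual TERRA-REF export format.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format readings: (plot, day after planting, parameter, value).
readings = [
    ("Plot 1", 1, "Soil Moisture", 4.2),
    ("Plot 1", 1, "Soil Moisture", 2.3),
    ("Plot 1", 1, "Soil Moisture", 3.1),
    ("Plot 2", 1, "Soil Moisture", 5.0),
    ("Plot 2", 1, "Soil Moisture", 3.0),
]


def to_wide(rows):
    """Return {plot: {column name: value}} with Max/Min/Mean per parameter and day."""
    grouped = defaultdict(list)
    for plot, day, param, value in rows:
        grouped[(plot, param, day)].append(value)

    wide = defaultdict(dict)
    for (plot, param, day), values in grouped.items():
        prefix = f"{param} Day {day}"
        wide[plot][f"{prefix} Max"] = max(values)
        wide[plot][f"{prefix} Min"] = min(values)
        wide[plot][f"{prefix} Mean"] = mean(values)
    return dict(wide)


wide_table = to_wide(readings)
```

Each parameter and day pair yields three columns, so the width grows as 3 x parameters x days, which is where the hundreds of columns come from.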
Yes. @remcochang and @rossarun can override me.
@diatomsRcool @remcochang @rossarun
Works in Progress:
`.csv` format within a `.zip` file. You can comment on the spreadsheets and gists directly, but it would be best to post questions and comments here so that we can all see them. Thank you!
First iteration of ML training data can be found in the Google Drive for now. Closing this issue and creating new ticket(s) for second iteration and including genomics data.
These are the data that we want to start with:
As an idea, there are some examples curated for another project here: https://terraref.ncsa.illinois.edu/d3m/ but ... the shape of the data will be different, and we also need to work on how to properly curate / annotate these (#2).
Genomics data
Need to get feedback from Pankaj on what genomics data to include; hopefully can use data from http://datacommons.cyverse.org/browse/iplant/home/shared/terraref
Phenotypes
Environment