Closed MagicMilly closed 4 years ago
@MagicMilly could you please clarify what the wide derived datasets are? Are these the metrics calculated in your notebooks like days to flowering, flag leaf emergence, gdd, etc?
Yes, any of the single traits like the ones you listed are wide, with one row per plot. Canopy height is a bit different, but when I originally queried in R using the traits package, it was tall format with all traits.
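To make the wide versus tall distinction concrete, here is a minimal sketch of reshaping one-row-per-plot records into one-row-per-plot-and-trait records. The column names (`plot`, `trait`, `value`) and the sample values are made up for illustration and are not taken from the actual datasets.

```python
def wide_to_tall(rows, id_cols, var_name="trait", value_name="value"):
    """Reshape wide records (one row per plot) into tall records
    (one row per plot and trait). Empty cells are skipped."""
    tall = []
    for row in rows:
        ids = {col: row[col] for col in id_cols}
        for col, val in row.items():
            if col in id_cols or val in ("", None):
                continue
            tall.append({**ids, var_name: col, value_name: val})
    return tall


# Illustrative wide input: hypothetical plots and values.
wide = [
    {"plot": "P1", "days_to_flowering": "62", "flag_leaf_emergence": "55"},
    {"plot": "P2", "days_to_flowering": "65", "flag_leaf_emergence": ""},
]
tall = wide_to_tall(wide, id_cols=["plot"])
```

Each non-empty trait cell becomes its own row, so a plot with a missing trait simply contributes fewer rows to the tall table.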
@MagicMilly Thanks for the clarification!!
Here is the code to take the tall-format trait_data.zip from Dryad and put it into a single tall file.
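The code referred to in that comment is not reproduced in this thread. As a rough sketch of the general idea only, the following assumes the archive holds several well-formed CSV files that are already in tall format; the function name `combine_zip_csvs` and the output path are hypothetical.

```python
import csv
import io
import zipfile


def combine_zip_csvs(zip_path, out_path):
    """Concatenate every CSV in a zip archive into one tall CSV.

    Columns are the union of all headers, in first-seen order;
    cells missing from a given file are written as empty strings.
    Returns the number of data rows written.
    """
    rows, fieldnames = [], []
    with zipfile.ZipFile(zip_path) as zf:
        for name in sorted(zf.namelist()):
            if not name.lower().endswith(".csv"):
                continue
            with zf.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.DictReader(text)
                for col in reader.fieldnames or []:
                    if col not in fieldnames:
                        fieldnames.append(col)
                rows.extend(reader)  # read rows while the member is open
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A call such as `combine_zip_csvs("trait_data.zip", "trait_data_tall.csv")` would then write the combined file. Only the standard library is used here; a pandas or R version would be shorter but is not assumed.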
Tall format tables will be uploaded to this Google Drive folder for feedback until they're ready to be shared on CyVerse. Please let me know if you cannot access.
@rbartelme Just an update: all of the raw, tall datasets for four seasons (including MAC 4 and 6 with the additional info like lat and lon) are now in the same folder that I shared above.
days_to_flowering as one of the traits. This is when I'll change the datetime objects and populate the other columns where possible.
@rbartelme The raw tall formats in the tall_format_data folder have been updated with days_to_ traits. I matched the new data with the original where I could, but I did not modify any of the date values. Hopefully that can be more easily done in R, though it may not be needed for this particular trait.
The code I used can be found on GitHub in the create_tall_formats notebook. Let me know if you notice any errors or have any other feedback.
As requested by Ryan, retain the tall format from the traits data queried from betydb, but apply the same cleaning functions used for the wide derived datasets. This will be easier for him to work with in R for machine learning.