genophenoenvo / terraref-datasets

Repository for code and small datasets derived from the TERRA REF program
MIT License

Prepare spreadsheet for Jaiswal group to annotate TERRA REF data with TO terms #9

Closed dlebauer closed 4 years ago

dlebauer commented 4 years ago

From kickoff notes:

THIS IS CRITICAL to the ability to cross reference the data and use them in an integrated way that informs. Important considerations need to be made for also contributing new terms to TO, given the complexity and wealth of data from TERRA REF.

@diatomsRcool and Pankaj, where should we start? How can we help?

dlebauer commented 4 years ago

@MagicMilly please create a table with all combinations of variables.name and methods.name in the TERRA REF dataset that you are preparing; then we can discuss with Pankaj et al. how to fill it out.
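For reference, a minimal sketch of how such a table of combinations could be built with pandas. The column names `trait` and `method`, the sample rows, and the empty `to_id` column are assumptions for illustration; the actual dataset may name these fields differently.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's columns and values may differ.
df = pd.DataFrame({
    "trait": ["canopy_height", "canopy_height", "leaf_temperature", "canopy_height"],
    "method": ["3D scanner", "manual", "thermal camera", "3D scanner"],
    "units": ["cm", "cm", "K", "cm"],
})

# Keep one row per distinct (trait, method) pair, sorted for easier review.
combos = (
    df[["trait", "method"]]
    .drop_duplicates()
    .sort_values(["trait", "method"])
    .reset_index(drop=True)
)

# Empty column (hypothetical name) for annotators to fill in with TO identifiers.
combos["to_id"] = ""

print(combos)
```

Exporting with `combos.to_csv("to_table.csv", index=False)` would give a file that can be imported into a shared spreadsheet; the file name here is only an example.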

MagicMilly commented 4 years ago

@dlebauer @diatomsRcool Here is the work-in-progress spreadsheet. Anybody should be able to comment - please let me know if you have trouble accessing or commenting.

MagicMilly commented 4 years ago

First draft of TO table. Still need to fill in some missing values once I find more information. Four traits have ontology identifiers in the dataset, which I included in the table:

MagicMilly commented 4 years ago

Example code used to find data for spreadsheet:

# Print the distinct units recorded for each trait in the DataFrame `df`.
unique_traits = df.trait.unique()
for trait in unique_traits:
    print(f'All unique units for {trait}: {df.loc[df.trait == trait].units.unique()}')

All code used can be found in this notebook
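The same per-trait summary could also be collected into a table instead of printed, which makes it easier to paste into a spreadsheet. This is a sketch under assumptions, not the notebook's code: the sample data is made up, and only the `trait` and `units` column names come from the snippet above.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
df = pd.DataFrame({
    "trait": ["canopy_height", "canopy_height", "leaf_temperature", "leaf_temperature"],
    "units": ["cm", "cm", "K", "C"],
})

# Group by trait, collect the distinct units, and join them into one string per trait.
units_by_trait = (
    df.groupby("trait")["units"]
    .unique()
    .apply(lambda u: ", ".join(sorted(u)))
    .reset_index()
)

print(units_by_trait)
```

A trait that shows more than one unit in this table is a candidate for a closer look before it is mapped to an ontology term.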

MagicMilly commented 4 years ago

TO table ready for review ahead of meeting on Friday. Will share link in group's Slack channel.

Missing values include

diatomsRcool commented 4 years ago

I sent the link to Laurel Cooper on Pankaj's team.

dlebauer commented 4 years ago

Looks great. The next step is to create a lookup table: https://github.com/genophenoenvo/terraref-datasets/issues/21
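One possible shape for such a lookup table, sketched with pandas. Everything here is hypothetical: the column names, the sample traits, and especially the identifiers, which are placeholders and not real TO terms. The linked issue may define the table differently.

```python
import pandas as pd

# Hypothetical observations keyed by trait name.
observations = pd.DataFrame({
    "trait": ["canopy_height", "leaf_temperature", "panicle_count"],
    "value": [112.0, 301.5, 4.0],
})

# Lookup table from trait name to ontology identifier.
# The identifiers are placeholders, NOT real TO terms.
lookup = pd.DataFrame({
    "trait": ["canopy_height", "leaf_temperature"],
    "to_id": ["TO:PLACEHOLDER_1", "TO:PLACEHOLDER_2"],
})

# A left join keeps every observation; traits without a mapping get NaN in to_id.
annotated = observations.merge(lookup, on="trait", how="left")

# List the traits that still need an ontology term.
unmapped = annotated.loc[annotated["to_id"].isna(), "trait"].tolist()
print(unmapped)
```

Using a left join rather than an inner join means unmapped traits stay visible instead of silently dropping out of the annotated data.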