Open mestato opened 6 years ago
https://phytozome.jgi.doe.gov/phytomine/template.do?name=Gene_Expression
Selecting P. trichocarpa, this gives me 992,040 rows.
Organized by: gene, abundance, experiment name, experiment group.
I think that experiment name will map to biomaterials, and experiment group will map to analysis for the expression module.
Potri.001G000100 0.0 BESC423.ZL 7 female early GeneAtlas Tissue Sample
Potri.001G000100 0.0 BESC443.ZG 43 female receptive GeneAtlas Tissue Sample
Potri.001G000100 0.0 BESC842.ZI 22 female late GeneAtlas Tissue Sample
I can rebuild this data quite easily into matrix format for loading.
Maybe @jwest60 would be interested? This could be a good excuse to practice Python.
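As a starting point for the matrix rebuild, here is a minimal sketch using only the standard library. It assumes the downloaded file is tab-separated with the four columns listed above (gene, abundance, experiment name, experiment group); the function names and the `gene` header label are placeholders, and the exact header layout the expression loader expects should be checked before use.

```python
import csv
import io
from collections import defaultdict


def long_to_matrix(rows):
    """Pivot (gene, abundance, experiment name, experiment group) rows.

    Returns the sample names in first-seen order and a nested dict
    gene -> {sample: abundance}. If a gene/sample pair repeats, the
    last value seen wins.
    """
    values = defaultdict(dict)
    samples = []
    seen = set()
    for gene, abundance, sample, _group in rows:
        if sample not in seen:
            seen.add(sample)
            samples.append(sample)
        values[gene][sample] = abundance
    return samples, values


def write_matrix(samples, values, handle, missing="NA"):
    """Write one row per gene and one column per sample, tab-separated."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["gene"] + samples)  # header label is a placeholder
    for gene in values:  # dicts keep insertion order
        writer.writerow([gene] + [values[gene].get(s, missing) for s in samples])


# The three example rows from this issue, assumed to be tab-separated.
example = (
    "Potri.001G000100\t0.0\tBESC423.ZL 7 female early\tGeneAtlas Tissue Sample\n"
    "Potri.001G000100\t0.0\tBESC443.ZG 43 female receptive\tGeneAtlas Tissue Sample\n"
    "Potri.001G000100\t0.0\tBESC842.ZI 22 female late\tGeneAtlas Tissue Sample\n"
)
samples, values = long_to_matrix(csv.reader(io.StringIO(example), delimiter="\t"))
out = io.StringIO()
write_matrix(samples, values, out)
matrix_text = out.getvalue()
```

For the real file, the `io.StringIO` objects would be replaced by opened file handles. With 992,040 rows the nested dict should fit in memory comfortably.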
Data is available at /staton/projects/populus_trichocarpa_expression
[x] Download data from phytozome
[ ] Generate matrix format expression file from current file (good joe task?)
[ ] Locate and import the biomaterial information
[ ] Import expression data
Getting the biomaterials info:
The expression list has this entry for the data in the table
The individual pages for the tissues are not helpful, e.g. Experiment Name: | BESC423.ZL 7 female early | Experiment Group: | GeneAtlas Tissue Sample
Looking for these tissue names, I can find them referenced in some pubs or in some static content, e.g.:
Note also that some tissue names are much less informative, e.g. stem-urea.
There are few enough samples that we could manually create a biomaterial matching each, and add whatever properties we can infer from the name (tissue type, treatment...)
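To get a first draft of those properties before curating by hand, something like the sketch below could work. Everything about the naming pattern is a guess from the three example names above: the leading `BESC423.ZL`-style token is treated as a sample identifier, the meaning of the number that follows is unknown, and the sex and stage vocabularies only cover the words seen so far. The property keys are hypothetical and are not controlled vocabulary terms.

```python
import re

# Guessed pattern for the leading identifier, e.g. "BESC423.ZL".
SAMPLE_ID = re.compile(r"^[A-Z]+\d+\.[A-Z]+$")

# Vocabularies seen in the example names only; extend as needed.
SEXES = ("female", "male")
STAGES = ("early", "receptive", "late")


def infer_properties(name):
    """Guess biomaterial properties from an experiment name."""
    props = {"name": name}
    tokens = name.split()
    if tokens and SAMPLE_ID.match(tokens[0]):
        props["sample_id"] = tokens.pop(0)
    if tokens and tokens[0].isdigit():
        # Meaning of this number is unknown; kept as a string for review.
        props["number"] = tokens.pop(0)
    for token in tokens:
        if token in SEXES:
            props["sex"] = token
        elif token in STAGES:
            props["stage"] = token
    if tokens:
        # Whatever is left is kept verbatim for manual curation.
        props["description"] = " ".join(tokens)
    return props


flower = infer_properties("BESC423.ZL 7 female early")
urea = infer_properties("stem-urea")
```

Names like stem-urea match none of the guessed rules, so they come back with only a description and still need a person to decide on tissue type and treatment.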
Populus trichocarpa has expression data available through JGI Phytozome. I can't find it in the browser or gene pages, but the expression values are available in the phytomine interface (https://phytozome.jgi.doe.gov/phytomine/report.do?id=50127750&trail=|50127750)
It may be possible to pull the normalized expression values in bulk through this interface. For metadata and/or if remapping/quantification is needed, the raw data is in NCBI. It seems that each sample is its own bioproject: PRJNA372410-PRJNA372412, PRJNA402533-PRJNA402568.
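If the metadata has to be looked up per bioproject, the accession ranges can be expanded into a flat list first. This is a small sketch that assumes both ranges are inclusive and contiguous, which has not been verified against NCBI; `expand_accessions` is a hypothetical helper name.

```python
import re

ACCESSION = re.compile(r"^([A-Z]+)(\d+)$")


def expand_accessions(spec):
    """Expand 'PRJNA1-PRJNA3,PRJNA7' style text into individual accessions."""
    accessions = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        first = ACCESSION.match(start)
        last = ACCESSION.match(end or start)  # a single accession is a range of one
        if not first or not last or first.group(1) != last.group(1):
            raise ValueError(f"cannot parse accession range: {part!r}")
        low, high = int(first.group(2)), int(last.group(2))
        if high < low:
            raise ValueError(f"range runs backwards: {part!r}")
        prefix = first.group(1)
        accessions.extend(f"{prefix}{n}" for n in range(low, high + 1))
    return accessions


bioprojects = expand_accessions("PRJNA372410-PRJNA372412,PRJNA402533-PRJNA402568")
```

Under that assumption the two ranges expand to 39 accessions, which can be compared against the number of distinct experiment names in the expression file.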