biothings / mygene.info

MyGene.info: A BioThings API for gene annotations
http://mygene.info

Integrate Ensembl Plant #38

sirloon closed 4 years ago

sirloon commented 6 years ago

Integrate Ensembl Plants data into MyGene.info, building on the existing Ensembl dumper and parsers (https://github.com/biothings/mygene.info/tree/master/src/hub/dataload/sources/ensembl). The data should come from the Ensembl Plants BioMart, on the same principle as the BioMart we already use for human, rat, ...: https://plants.ensembl.org/biomart/martview

We'll start with Arabidopsis thaliana, then continue with other plants.

sirloon commented 5 years ago

1) The dumper. This can't be automated using a plugin; the dumper has to be written manually. Data is fetched from this service: https://plants.ensembl.org/biomart/martview. From there, design a query (dataset, fields, filters), then use the XML button to see the corresponding XML request. Requests of this kind are already stored as templates in our current Ensembl dumper: https://github.com/biothings/mygene.info/blob/master/src/hub/dataload/sources/ensembl/dump.py

2) The parser. Depending on the XML requests we need, the parsers may turn out to be the same as the existing ones (some refactoring needed).
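To make the two steps above concrete, here is a rough sketch of what a BioMart XML request and a matching TSV parser could look like. It is not the code from dump.py. The `martservice` endpoint, the `plants_mart` schema name, the `athaliana_eg_gene` dataset name, and the attribute and filter names are all assumptions that should be checked against the XML button output in martview. The helper names (`build_query`, `fetch`, `parse_tsv`) are hypothetical.

```python
import csv
import io
import urllib.parse
import urllib.request
from xml.sax.saxutils import quoteattr

# Assumption: the query endpoint sits next to the martview page linked above.
MART_URL = "https://plants.ensembl.org/biomart/martservice"


def build_query(dataset, attributes, filters=None, schema="plants_mart"):
    """Return a BioMart XML query string asking for TSV output without a header.

    The schema name default is an assumption; copy the real value from the
    XML shown by martview.
    """
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        "<!DOCTYPE Query>",
        f"<Query virtualSchemaName={quoteattr(schema)} "
        'formatter="TSV" header="0" uniqueRows="1" datasetConfigVersion="0.6">',
        f'<Dataset name={quoteattr(dataset)} interface="default">',
    ]
    # Filters first, then attributes, mirroring the order martview produces.
    for name, value in (filters or {}).items():
        parts.append(f"<Filter name={quoteattr(name)} value={quoteattr(value)}/>")
    for name in attributes:
        parts.append(f"<Attribute name={quoteattr(name)}/>")
    parts += ["</Dataset>", "</Query>"]
    return "".join(parts)


def fetch(xml_query, url=MART_URL, timeout=60):
    """POST the query and return the raw TSV text (requires network access)."""
    data = urllib.parse.urlencode({"query": xml_query}).encode("ascii")
    with urllib.request.urlopen(url, data=data, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_tsv(text, columns):
    """Yield one dict per non-empty TSV row, keyed by the requested columns."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:
            yield dict(zip(columns, row))
```

A possible usage, again with illustrative names: `cols = ["ensembl_gene_id", "external_gene_name"]`, then `rows = parse_tsv(fetch(build_query("athaliana_eg_gene", cols)), cols)`. Keeping the dataset name as a parameter is what would let the same request template and parser be reused when other plants are added.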

namespacestd0 commented 4 years ago

Implemented here: https://github.com/biothings/mygene.info/tree/master/src/hub/dataload/sources/ensembl_plant