SuLab / biogps_core

http://biogps.org/
GNU General Public License v3.0

Load all GEO GDS datasets #37

Open JTFouquier opened 8 years ago

JTFouquier commented 8 years ago

We need to build a tool to load all GEO GDS datasets (GDS datasets contain curated metadata). It should support incremental updates.
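
One way to read the incremental-update requirement is as a set difference between the GDS accessions available upstream and those already loaded. The sketch below is only an illustration: `select_new_datasets` and both ID lists are hypothetical and are not part of biogps_core or of any GEO API.

```python
def select_new_datasets(available_ids, loaded_ids):
    """Return the accessions that are available but not yet loaded.

    Both arguments are plain iterables of accession strings; how they are
    obtained (remote listing, local database query) is left open here.
    """
    loaded = set(loaded_ids)
    # Deduplicate and sort (plain string order) so repeated runs are stable.
    return sorted({acc for acc in available_ids if acc not in loaded})


if __name__ == "__main__":
    # Hypothetical accessions, used only to show the call.
    print(select_new_datasets(["GDS3", "GDS1", "GDS2"], ["GDS1"]))
```

A loader built this way would only fetch and parse the returned accessions on each run, and would record them as loaded once they are stored.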

Useful resources:


JTFouquier commented 8 years ago

An alternative is to load datasets from ArrayExpress, which imports datasets from GEO (plus other resources) and curates them manually.

Related code examples:

http://pythonhosted.org/bioservices/references.html#bioservices.arrayexpress.ArrayExpress


Original comment by: Chunlei Wu

JTFouquier commented 8 years ago

Ref: http://www.ebi.ac.uk/arrayexpress/help/programmatic_access.html

To get a list of Experiments:

For a given Experiment (using the id returned from above queries):


Original comment by: Chunlei Wu

JTFouquier commented 8 years ago

An ArrayExpress experiment imported from GEO has an ID that follows this pattern:

E-GEOD-32474 <--> GSE32474
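
The mapping above can be sketched as a pair of small conversion helpers. This assumes the pattern holds as shown in the example (an `E-GEOD-` prefix followed by the numeric part of the GSE accession); the function names are hypothetical.

```python
import re

# Patterns inferred from the single example above; treat them as assumptions.
_GSE_RE = re.compile(r"GSE(\d+)")
_AE_GEO_RE = re.compile(r"E-GEOD-(\d+)")


def gse_to_arrayexpress(gse_id):
    """Convert a GEO series accession (GSE...) to its ArrayExpress ID."""
    match = _GSE_RE.fullmatch(gse_id)
    if match is None:
        raise ValueError("not a GSE accession: %r" % (gse_id,))
    return "E-GEOD-" + match.group(1)


def arrayexpress_to_gse(ae_id):
    """Convert an ArrayExpress ID imported from GEO back to a GSE accession."""
    match = _AE_GEO_RE.fullmatch(ae_id)
    if match is None:
        raise ValueError("not a GEO-derived ArrayExpress ID: %r" % (ae_id,))
    return "GSE" + match.group(1)
```

IDs that do not match the expected pattern raise `ValueError` instead of being silently passed through, so experiments that did not come from GEO are easy to spot.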


Original comment by: Chunlei Wu

JTFouquier commented 8 years ago

When making web-service calls, consider using a library like httplib2, which supports local caching, to avoid hitting the web services too often during development.
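
A minimal sketch of the caching setup, assuming httplib2 is installed. Passing a directory name to `httplib2.Http` enables its file-based cache; the helper names and the `.cache` directory are hypothetical, and whether a response is actually served from the cache depends on the caching headers the server sends.

```python
def make_cached_client(cache_dir=".cache"):
    """Create an httplib2 client that caches responses under cache_dir."""
    # Imported here so this module still loads where httplib2 is missing.
    import httplib2

    return httplib2.Http(cache_dir)


def fetch(client, url):
    """GET url and return (status, served_from_cache, body_bytes)."""
    response, content = client.request(url, "GET")
    # httplib2 sets response.fromcache to True when the cache was used.
    return response.status, response.fromcache, content


if __name__ == "__main__":
    client = make_cached_client()
    url = "http://www.ebi.ac.uk/arrayexpress/help/programmatic_access.html"
    # The second call may come from the local cache if the server allows it.
    for _ in range(2):
        status, from_cache, body = fetch(client, url)
        print(status, from_cache, len(body))
```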


Original comment by: Chunlei Wu