Open JTFouquier opened 8 years ago
An alternative is to load datasets from ArrayExpress, which imports datasets from GEO (plus other resources) and curates them manually.
Related code examples:
http://pythonhosted.org/bioservices/references.html#bioservices.arrayexpress.ArrayExpress
Original comment by: Chunlei Wu
Ref: http://www.ebi.ac.uk/arrayexpress/help/programmatic_access.html
To get a list of Experiments:
by array type ID: http://www.ebi.ac.uk/arrayexpress/json/v2/files?array=A-AFFY-33
full list of array types can be obtained here: ftp://ftp.ebi.ac.uk/pub/databases/microarray/data/array/
For a given Experiment (using the id returned from above queries):
get the list of files associated with this Experiment: http://www.ebi.ac.uk/arrayexpress/json/v2/files/E-MEXP-31
The *.sdrf.txt file is the sample description file.
The *.processed.*.zip file is the processed data file.
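A minimal sketch of the two queries above, assuming the URL layout quoted in this comment is still what the service exposes. The helper names and the sample file names in the usage note are hypothetical; only the URL patterns and the two file-name patterns come from the comment.

```python
import fnmatch

# Base URL copied from the comment above; the service may have moved since.
BASE = "http://www.ebi.ac.uk/arrayexpress/json/v2/files"


def files_by_array_url(array_id):
    """Build the query URL listing files for all experiments on one array type."""
    return f"{BASE}?array={array_id}"


def files_for_experiment_url(accession):
    """Build the URL listing the files attached to a single experiment."""
    return f"{BASE}/{accession}"


def classify_files(names):
    """Pick out sample-description and processed-data files by name pattern."""
    return {
        "sdrf": [n for n in names if fnmatch.fnmatchcase(n, "*.sdrf.txt")],
        "processed": [n for n in names if fnmatch.fnmatchcase(n, "*.processed.*.zip")],
    }
```

For example, `classify_files(["E-MEXP-31.sdrf.txt", "E-MEXP-31.processed.1.zip", "E-MEXP-31.idf.txt"])` keeps the first name under `"sdrf"`, the second under `"processed"`, and ignores the third. How the file names are extracted from the JSON response is left out here, since the response layout is not described in this thread.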
Original comment by: Chunlei Wu
An ArrayExpress experiment loaded from GEO has an ID that follows this pattern:
E-GEOD-32474 <--> GSE32474
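The mapping can be sketched as a pair of small converters. The function names are hypothetical, and the code assumes every GEO series accession is `GSE` followed by digits, as in the example above.

```python
import re


def geo_to_arrayexpress(gse):
    """Convert a GEO series accession (GSE32474) to its ArrayExpress ID."""
    match = re.fullmatch(r"GSE(\d+)", gse)
    if match is None:
        raise ValueError(f"not a GEO series accession: {gse!r}")
    return f"E-GEOD-{match.group(1)}"


def arrayexpress_to_geo(accession):
    """Convert an ArrayExpress ID imported from GEO back to the GSE accession."""
    match = re.fullmatch(r"E-GEOD-(\d+)", accession)
    if match is None:
        raise ValueError(f"not a GEO-derived ArrayExpress ID: {accession!r}")
    return f"GSE{match.group(1)}"
```

Experiments that did not come from GEO (for example E-MEXP-31) raise `ValueError` in `arrayexpress_to_geo` instead of returning a made-up accession.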
Original comment by: Chunlei Wu
When making web-service calls, consider using a library like httplib2, which supports local caching, to avoid hitting the web services too often during development.
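To illustrate the idea without extra dependencies, here is a rough standard-library sketch of an on-disk cache. It is a stand-in for httplib2, not its API: httplib2 can cache by itself when given a cache directory, and unlike this sketch it honors HTTP cache headers. The `cached_get` name and the `fetch` parameter are hypothetical.

```python
import hashlib
import urllib.request
from pathlib import Path


def _download(url):
    """Fetch the raw response body over the network."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def cached_get(url, cache_dir, fetch=_download):
    """Return the body for url, reusing a copy on disk when one exists.

    Assumption: responses never expire, which is only reasonable during
    development. `fetch` can be swapped out, e.g. for offline testing.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it is always a safe file name.
    path = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if path.exists():
        return path.read_bytes()
    body = fetch(url)
    path.write_bytes(body)
    return body
```

Calling `cached_get` twice with the same URL and cache directory hits the network only once; deleting the cache directory forces a refresh.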
Original comment by: Chunlei Wu
We need to build a tool to load all GEO GDS datasets (GDS datasets contain curated metadata). It should support incremental updates.
Useful resources: