ewels / sra-explorer

Web application to explore the Sequence Read Archive.
https://sra-explorer.info/
GNU General Public License v2.0

Search by GEO series ID does not seem to work #34

Closed: rioualen closed this issue 3 years ago

rioualen commented 3 years ago

Hello,

Thanks a lot for a very useful tool! However, I've been having trouble finding datasets using their GEO series identifier. For example:

GSE113562 doesn't yield any result

PRJNA451515 (BioProject ID) works

GSM3109110 (sample from the above series) works

I noticed the same behaviour for a variety of datasets. Is this a bug?

Would there be a way to get the full metadata in TSV format for a list of GSM (or GSE) IDs programmatically rather than using the webpage?

Best

Claire

ewels commented 3 years ago

It's because this accession isn't listed in the SRA: https://www.ncbi.nlm.nih.gov/sra/?term=GSE113562

sra-explorer just uses the main SRA search, so if it can't be found there then it won't show up. Nothing I can really do about this, sorry.

Phil
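To check from a script whether an accession is found by the main SRA search, one option is to query NCBI E-utilities directly. The following is a minimal sketch, not sra-explorer's actual code: it assumes the search goes through the E-utilities `esearch` endpoint with `db=sra` and JSON output, and the helper names (`esearch_url`, `parse_esearch`, `search_sra`) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public NCBI E-utilities esearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="sra", retmax=20):
    """Build an esearch URL for a search term (hypothetical helper)."""
    query = urllib.parse.urlencode(
        {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    )
    return f"{ESEARCH}?{query}"


def parse_esearch(payload):
    """Return (hit count, list of UIDs) from an esearch JSON response."""
    result = json.loads(payload)["esearchresult"]
    # esearch reports the count as a string, so convert it.
    return int(result["count"]), list(result["idlist"])


def search_sra(term):
    """Run the search over the network and return (count, uids)."""
    with urllib.request.urlopen(esearch_url(term), timeout=30) as resp:
        return parse_esearch(resp.read().decode("utf-8"))
```

Calling `search_sra("GSE113562")` and `search_sra("PRJNA451515")` would show whether each accession returns any hits. A count of zero means the SRA search itself finds nothing for that term.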

ewels commented 3 years ago

Sorry, missed a bit:

> Would there be a way to get the full metadata in TSV format for a list of GSM (or GSE) IDs programmatically rather than using the webpage?

Absolutely! This is all that sra-explorer is doing :) Check out the SRA API that is used here: https://www.ncbi.nlm.nih.gov/home/develop/api/

There are also a number of more user-friendly tools that people have built for this. For example, the relatively recent ffq: https://github.com/pachterlab/ffq
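For the TSV part of the question, here is a rough sketch of one possible approach using only the Python standard library. It assumes the E-utilities `efetch` endpoint returns run metadata as CSV when called with `db=sra` and `rettype=runinfo`, and that the input is a list of numeric SRA UIDs (as returned by an `esearch` query against `db=sra`), not GSM or GSE accessions. The function names are hypothetical.

```python
import csv
import io
import urllib.parse
import urllib.request

# Assumption: the public NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_runinfo_url(uids):
    """Build an efetch URL asking for run info for a list of SRA UIDs."""
    query = urllib.parse.urlencode(
        {"db": "sra", "id": ",".join(uids), "rettype": "runinfo", "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


def csv_to_tsv(csv_text):
    """Convert CSV text to tab-separated text, dropping empty lines."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()


def fetch_runinfo_tsv(uids):
    """Download run info over the network and return it as TSV text."""
    with urllib.request.urlopen(efetch_runinfo_url(uids), timeout=60) as resp:
        return csv_to_tsv(resp.read().decode("utf-8"))
```

The result of `fetch_runinfo_tsv(uids)` can be written straight to a `.tsv` file. For many accessions, NCBI's usage guidelines on request rate and API keys apply, so batching the UIDs into fewer requests is preferable to one request per ID.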