seandavi / SRAdb

Git mirror of Bioconductor SRAdb package

pathogen host data #29

Closed. noahaus closed this issue 4 years ago

noahaus commented 4 years ago

Greetings,

I'm trying to develop a script to extract host metadata for bacterial pathogens. This information is available in the SRA Run Selector, but the output I get from SRAdb does not include it. Is there any way to access SRA Run Selector data based on the search term I provide?

Any help would be greatly appreciated!

Noah A. Legall

seandavi commented 4 years ago

Can you give an example of what you are trying to find?

noahaus commented 4 years ago

Certainly!

https://www.ncbi.nlm.nih.gov/Traces/study/?WebEnv=NCID_1_12587053_130.14.22.76_5555_1580132111_3307506105_0MetA0_S_HStore&query_key=2

The link above will take you to the SRA Run Selector, which aggregates all the metadata into one list.

SRAdb provides metadata fields, but not quite the same ones (it lacks, for example, host information for pathogen sequences). I was curious whether there is any way I could extract this information using SRAdb.

seandavi commented 4 years ago

Due to an accumulation of changes and the growing size of SRAdb, we have moved over to a more flexible system that includes an open API for search and retrieval. The project, OmicIDX, is still a work in progress, but you may find the sketchy docs useful:

https://omicidx.github.io/omicidx-docs

https://omicidx.github.io/omicidx-docs/docs/open-web-api/rest-api/

The entire dataset is also available in Google BigQuery for SQL access (requires a Google Cloud project). An example of the output for a single "run" looks like:

https://api.omicidx.cancerdatasci.org/sra/runs/SRR5196685

I'm going to close this for now. Feel free to take a look at https://github.com/omicidx and submit issues for bugs or feature requests.
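
As a rough illustration of the single-run URL above, the following Python sketch builds that URL for a given accession, fetches it, and searches the response for host-related keys. Only the URL pattern comes from this thread. The helper names (`run_url`, `fetch_run`, `find_keys`) are hypothetical, and the sketch assumes the endpoint is still reachable and returns a JSON object; the actual field names in the response are not documented here, so the key search is deliberately generic.

```python
import json
import urllib.request

# Base URL taken from the example link in this thread; it may have changed since.
OMICIDX_BASE = "https://api.omicidx.cancerdatasci.org"


def run_url(accession):
    """Build the per-run URL following the pattern of the example link."""
    return f"{OMICIDX_BASE}/sra/runs/{accession}"


def fetch_run(accession, timeout=30.0):
    """Fetch one run record. Assumes the endpoint responds with JSON."""
    with urllib.request.urlopen(run_url(accession), timeout=timeout) as resp:
        return json.load(resp)


def find_keys(obj, needle, path=""):
    """Recursively collect (path, value) pairs whose key contains `needle`.

    The response layout is an unknown here, so this walks any nesting of
    dicts and lists instead of assuming specific field names.
    """
    hits = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            here = f"{path}.{key}" if path else key
            if needle.lower() in key.lower():
                hits.append((here, value))
            hits.extend(find_keys(value, needle, here))
    elif isinstance(obj, list):
        for i, item in enumerate(obj):
            hits.extend(find_keys(item, needle, f"{path}[{i}]"))
    return hits
```

Calling `find_keys(fetch_run("SRR5196685"), "host")` would then list any keys containing "host", if the record has them. If attributes are instead stored as tag/value pairs, the search would need to inspect values as well as keys.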

seandavi commented 4 years ago

One more thing, if you can provide the search term(s) you are using to generate your Entrez result set, I can try to bootstrap on that to generate an example query via OmicIDX.

noahaus commented 4 years ago

Certainly. I searched for "Mycobacterium bovis".

seandavi commented 4 years ago

@noahaus, I just posted an issue over on omicidx/omicidx-api#21 with a quick pass at a solution for getting the full metadata for your organism.