ababaian / serratus

Ultra-deep search for novel viruses
http://serratus.io
GNU General Public License v3.0

Brief SRA metadata for a subset of SRR ids #256

Closed eul2021 closed 2 years ago

eul2021 commented 2 years ago

Hello, I would like to get a brief SRA description of the sample/origin that's shown below the SRR id in the Serratus explorer. I've selected a list of ~3000 SRR with occurrences of a certain viral family and then would like to download that list of SRR with the short description metadata. Can you please let me know how I can do it via the app or programmatically? Thank you very much!

ababaian commented 2 years ago

Heya @eul2021, yeah, maybe the simplest option is to try the SRA website and slap an OR between each SRR accession code as your search query. Like this: https://www.ncbi.nlm.nih.gov/sra/?term=ERR3569471++OR+ERR3569472 (maybe break your list down into chunks of 250 accessions).

If that returns the results you want, click on "Send To: --> File --> SraRunInfo" and you get a simple table of the results.
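For ~3000 accessions, building those search URLs by hand gets tedious, so here is a minimal sketch of the chunking idea in Python. It only constructs the URLs (no network calls); the helper names `chunked` and `sra_search_urls` are made up for illustration, and the chunk size of 250 is just the rough figure suggested above, not a documented limit.

```python
from urllib.parse import quote_plus

# Base search URL taken from the example link above.
SRA_SEARCH_URL = "https://www.ncbi.nlm.nih.gov/sra/?term="


def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def sra_search_urls(accessions, chunk_size=250):
    """Build one SRA search URL per chunk, joining accessions with OR.

    `chunk_size=250` is an assumption based on the suggestion in this thread.
    """
    urls = []
    for chunk in chunked(accessions, chunk_size):
        term = " OR ".join(chunk)
        # quote_plus encodes the spaces around OR as '+'.
        urls.append(SRA_SEARCH_URL + quote_plus(term))
    return urls


if __name__ == "__main__":
    for url in sra_search_urls(["ERR3569471", "ERR3569472"]):
        print(url)
```

Each URL can then be opened in a browser and exported with "Send To" as described above.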

You can also try the SRAdb package (an R/Bioconductor package).

Finally, you can log in to the Serratus SQL server, where we have a table called srarun which contains the standard SRA meta-data for all the runs we have analyzed.
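If you go the SQL route, a query against `srarun` for a list of accessions might look roughly like the sketch below. This only builds the query text and parameter list; it does not connect to anything. Two things here are assumptions rather than facts from this thread: the accession column name (`run` is a guess, check the actual table schema) and the `%s` placeholder style, which is what psycopg2-style drivers use and may differ for your client.

```python
def srarun_query(accessions, column="run"):
    """Return (sql, params) selecting rows of `srarun` for the given accessions.

    `column` is a hypothetical column name; verify it against the real schema.
    The values are passed as parameters rather than pasted into the SQL text.
    """
    accessions = list(accessions)
    if not accessions:
        raise ValueError("need at least one accession")
    # One placeholder per accession; '%s' is an assumed driver paramstyle.
    placeholders = ", ".join(["%s"] * len(accessions))
    sql = f"SELECT * FROM srarun WHERE {column} IN ({placeholders})"
    return sql, accessions


if __name__ == "__main__":
    sql, params = srarun_query(["ERR3569471", "ERR3569472"])
    print(sql)
    print(params)
```

The resulting pair can be handed to a database cursor's `execute(sql, params)` once you have a connection to the server.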

ababaian commented 2 years ago

Oh, and this might be a great time for me to plug palmID: an experimental RdRP interface for the Serratus data which pulls SRA meta-data automatically. From the RdRP reports here you can download the meta-data directly as a CSV. It doesn't work for a whole family though; you input a single sequence and it will pull all matches down to ~40% identity in the RdRP. This is a more conservative analysis, but it has good specificity for RNA viruses, which can subsequently be assembled.

eul2021 commented 2 years ago

Artem, thank you very much for a quick response and for useful suggestions! I'll try all of them:-)