IndexThePlanet / Logan

Logan Unitigs and Contigs

Fetching transcripts from RNA-seq SRA datasets #10

Open cakeinspace opened 1 week ago

cakeinspace commented 1 week ago

Hello, thanks for the wonderful work. I am interested in downloading all transcripts for a given gene from the RNA-seq datasets available in SRA.

Is there a way to specify metadata such as the species name, along with the gene of interest, and fetch the matching transcripts together with their associated metadata?

From reading the tutorials, one way I could figure out was to simply get all accession IDs for RNA-seq runs in SRA, then provide the gene of interest as a FASTA file and run the tutorial as in chicken.md.

I just wanted to know if there's a smarter way, or am I completely off?

Thanks in advance

rchikhi commented 13 hours ago

Hi,

Your best bet is to familiarize yourself with https://www.ncbi.nlm.nih.gov/sra/docs/sra-cloud-based-examples/ to grab a list of datasets that are definitely RNA-seq runs from the species you are interested in.
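
As a rough illustration of the kind of metadata query those examples describe, here is a minimal sketch. It assumes the SRA metadata is queried through BigQuery as the table `nih-sra-datastore.sra.metadata`, with columns named `acc`, `organism`, `assay_type` and `bioproject`; please check those names against the NCBI page before relying on them. The helper `build_rnaseq_query` is hypothetical and only builds the SQL text, it does not run anything.

```python
# Hypothetical helper: builds (but does not execute) a SQL query that lists
# RNA-seq run accessions for one organism.
# ASSUMPTION: table and column names below follow the NCBI SRA cloud
# metadata table; verify them against the NCBI documentation.
DEFAULT_TABLE = "nih-sra-datastore.sra.metadata"


def build_rnaseq_query(organism: str, table: str = DEFAULT_TABLE) -> str:
    """Return SQL text selecting RNA-seq runs for the given organism name."""
    # Refuse quotes and backslashes rather than trying to escape them,
    # so the organism name is always a plain SQL string literal.
    if any(ch in organism for ch in "'\"\\"):
        raise ValueError("organism name must not contain quotes or backslashes")
    return (
        "SELECT acc, organism, bioproject\n"
        f"FROM `{table}`\n"
        f"WHERE organism = '{organism}'\n"
        "  AND assay_type = 'RNA-Seq'"
    )


if __name__ == "__main__":
    # Print the query so it can be pasted into a query console.
    print(build_rnaseq_query("Gallus gallus"))
```

The `acc` column of the result would then be the accession list to feed into a tutorial such as chicken.md.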

If you need assistance with this, please post here.

best, Rayan

rchikhi commented 13 hours ago

Ah, to answer your question: there is currently no way to specify a gene of interest. We didn't annotate the Logan assembled sequences/transcripts one by one, nor did we call genes on them (yet).