FireLabSoftware / ScanRabbit

ScanRabbit-- Assembly-independent filtering of NGS datasets for a well-defined protein motif
MIT License

anticodon search in rnaseq #1

Open igortru opened 3 weeks ago

igortru commented 3 weeks ago

Just curious: is it possible to implement something similar for tRNA search? My interest is “missing” tRNA anticodons. See https://pmc.ncbi.nlm.nih.gov/articles/PMC8007984/

FireLabSoftware commented 3 weeks ago

This is a really nice idea but would certainly take some knowledge of tRNA scanning to implement.

For background: ScanRabbit is not a particularly sophisticated tool, and its search and evaluation routine for protein-coding homology is simple (it just looks at coherence with a supplied multiple sequence alignment in a defined region). There are much more sophisticated programs, such as HMMER (Sean Eddy), but we built our own so that simple Python tools could process (and in some cases reject) a broad set of input files, which are then searched and summarized in a single output document.
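To make "coherence with a supplied multiple sequence alignment" concrete, here is a minimal Python sketch of one generic way to score a peptide window against an alignment: a log-odds position-specific scoring matrix (PSSM). This is not ScanRabbit's actual routine. The function names, the uniform background, and the pseudocount are all assumptions chosen for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length aligned peptides.

    Assumes a uniform background frequency (1/20); gaps and other
    characters outside AMINO_ACIDS are ignored when counting.
    """
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(len(alignment[0])):
        counts = Counter(row[col] for row in alignment if row[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score_window(pssm, peptide):
    """Sum per-column log-odds; unknown residues contribute 0."""
    return sum(column.get(aa, 0.0) for column, aa in zip(pssm, peptide))
```

A window that agrees with the alignment scores higher than one that does not, so a threshold on `score_window` could serve as a simple accept/reject filter.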

From what I have heard at many Bay Area RNA club meetings, searching for tRNAs is a much more arduous task than searching for proteins that match an HMM or MSA. If one had nice code for tRNA searching, though, the flexibility of Python should allow that code to be dropped into ScanRabbit to take advantage of ScanRabbit's ability to deal with large numbers of sometimes-imperfect input SRA files. The folks who have done things like this are Sean Eddy and Todd Lowe (UCSC); they would likely have some good ideas and might already have built this.

Best Regards, Andy Fire (aka the Scan Rabbit Rabbit Scanner)

igortru commented 3 weeks ago

What if we collect all 21-nt tRNA k-mers, centered on the anticodon, from all prokaryotic genomes in GenBank; select only the unique k-mers with non-canonical anticodons; translate them into amino acids; build an alignment, motif, PSSM, etc.; and then use ScanRabbit as is?

If 21 nt is not enough, it can be increased to 33 nt, for example.
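The k-mer step of this proposal could be prototyped roughly as follows. This is a sketch under stated assumptions, not tested on real data. The input format (pairs of tRNA sequence and 0-based anticodon start) and the `canonical` set are hypothetical and would have to come from an annotation source. With k = 21 or k = 33 the anticodon falls in frame, so it becomes a single residue in the translated peptide.

```python
# Standard genetic code (NCBI table 1), in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def anticodon_kmer(seq, anticodon_start, k=21):
    """Return the k-nt window centered on the 3-nt anticodon, or None if it does not fit."""
    flank = (k - 3) // 2
    start = anticodon_start - flank
    end = anticodon_start + 3 + flank
    if start < 0 or end > len(seq):
        return None
    return seq[start:end].upper()


def translate(kmer):
    """Translate in frame 0; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(kmer[i:i + 3], "X") for i in range(0, len(kmer) - 2, 3)
    )


def noncanonical_peptides(records, canonical, k=21):
    """Translate unique k-mers whose central anticodon is not in `canonical`.

    records: iterable of (sequence, anticodon_start) pairs (hypothetical format).
    canonical: set of anticodons (DNA alphabet) to exclude (an assumption).
    """
    flank = (k - 3) // 2
    seen = set()
    peptides = []
    for seq, pos in records:
        kmer = anticodon_kmer(seq, pos, k)
        if kmer is None or kmer in seen:
            continue
        seen.add(kmer)
        if kmer[flank:flank + 3] in canonical:
            continue
        peptides.append(translate(kmer))
    return peptides
```

The resulting peptides could then be aligned to build the motif or PSSM mentioned above. Whether translated tRNA windows carry enough signal for this to work is an open question.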

FireLabSoftware commented 2 weeks ago

Not sure what would happen with this... there is no reason it wouldn't have some power to find things, but it also seems that Todd Lowe (and others) should have better algorithms based on their studies of tRNA structure. I should also see Todd next month at our regional RNA meeting and can ask him whether things like this have already been made to happen. All best, andy

igortru commented 2 weeks ago

Dear Andy!

Stats from GtRNAdb (Lowe Lab):

https://gtrnadb.ucsc.edu/search.html

Serine (Ser) anticodons: AGA GGA CGA TGA ACT GCT

TGA: 9,659 tRNAs

ACT: 632 tRNAs, none prokaryotic; at first glance I see only eukaryotic organisms.

I am sure that tens of thousands of tRNAs with the ACT anticodon can be found in the complete prokaryotic genomes in GenBank.

I think GtRNAdb is the best place to store all tRNAs from GenBank. In addition, it could make it possible to retrieve them together with their genomic context (like the IPG report for proteins). Example: https://www.ncbi.nlm.nih.gov/ipg/?term=WP_000100000

Eukaryota 609 178,889

Archaea 220 10,476

Bacteria 4,038 242,068

(GenBank has millions of non-instantiated tRNA features annotated with tRNAscan-SE.) Please ask Todd what he thinks about it; I would be interested to know.