hillerlab / TOGA

TOGA (Tool to infer Orthologs from Genome Alignments): implements a novel paradigm to infer orthologous genes. TOGA integrates gene annotation, inferring orthologs and classifying genes as intact or lost.
MIT License

support for transferring miRNA annotations? #173

Open lichennan123 opened 3 months ago

lichennan123 commented 3 months ago

Hello!

I have been using TOGA lately to transfer hg38 gene annotations to a newly constructed genome, and it looks good so far! However, I notice that annotations for non-coding RNAs such as miRNAs are not transferred to my query genome at all, even though my reference annotation contains them and the isoform.txt input includes Ensembl gene/transcript IDs for every gene type. Do you have any thoughts on why this happens and what I can do to get these annotations transferred? I'd appreciate your advice! Let me know if additional details are needed.

Sincerely, Chennan

MichaelHiller commented 3 months ago

Right now, TOGA can only handle coding genes, because CESAR is a codon-aware aligner. We don't have a TOGA version that can project non-coding genes.

If you want to do some coding, you could do the following: For miRNAs that are close to genes or intronic, you could use the aligning miRNA coordinates of the orthologous chain to annotate them in the query.
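
A minimal sketch of that idea, assuming the orthologous chain for the neighboring gene has already been picked and its UCSC chain-format block lines are at hand. The function names `chain_blocks` and `project_interval` are hypothetical and not part of TOGA. The sketch only projects the aligned parts of the miRNA and spans any alignment gaps in between, so results should be checked by hand.

```python
from typing import List, Optional, Tuple

# (ref_start, query_start, size) of one ungapped aligned block
Block = Tuple[int, int, int]


def chain_blocks(t_start: int, q_start: int, lines: List[str]) -> List[Block]:
    """Expand the 'size dt dq' lines of one UCSC chain into ungapped blocks."""
    blocks = []
    t, q = t_start, q_start
    for line in lines:
        fields = [int(x) for x in line.split()]
        size = fields[0]
        blocks.append((t, q, size))
        t += size
        q += size
        if len(fields) == 3:  # the last line of a chain holds only the size
            t += fields[1]    # gap in the reference
            q += fields[2]    # gap in the query
    return blocks


def project_interval(
    blocks: List[Block],
    ref_start: int,
    ref_end: int,
    q_size: int,
    q_strand: str,
) -> Optional[Tuple[int, int]]:
    """Project a half-open reference interval onto the query.

    Returns plus-strand query coordinates, or None if no base of the
    interval lies inside an aligned block of this chain.
    """
    hits = []
    for t, q, size in blocks:
        lo = max(ref_start, t)
        hi = min(ref_end, t + size)
        if lo < hi:  # the interval overlaps this aligned block
            hits.append((q + (lo - t), q + (hi - t)))
    if not hits:
        return None
    q_lo = min(h[0] for h in hits)
    q_hi = max(h[1] for h in hits)
    if q_strand == "-":
        # chain files give minus-strand query coordinates on the reverse strand
        q_lo, q_hi = q_size - q_hi, q_size - q_lo
    return q_lo, q_hi


# Toy chain (made-up numbers): reference starts at 1000, query at 5000
blocks = chain_blocks(1000, 5000, ["100 10 0", "50 0 5", "200"])
mirna_in_query = project_interval(blocks, 1120, 1180, q_size=10000, q_strand="+")
```

Whether a projected interval is a real ortholog of the miRNA still has to be judged separately, for example by sequence conservation of the hairpin.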

lichennan123 commented 3 months ago

This is very helpful - thank you!

xfyhy commented 1 month ago

> Right now, TOGA can only handle coding genes, because CESAR is a codon-aware aligner. We don't have a TOGA version that can project non-coding genes.
>
> If you want to do some coding, you could do the following: For miRNAs that are close to genes or intronic, you could use the aligning miRNA coordinates of the orthologous chain to annotate them in the query.

Pardon me for hitchhiking. @MichaelHiller, you suggested using a chain file to map coordinates between genomes. I noticed that liftOver can map between different assembly versions of the same species, but what software should be used for cross-species mapping? Thank you!

MichaelHiller commented 1 month ago

Right, liftOver requires chains in which each base of the reference overlaps at most one chain (they are computed by back-converting nets to chains). liftOver can also work. But for miRNAs close to genes, TOGA's orthology calls (which chains represent orthologs for this gene) could be useful as well. This is something that needs to be explored and benchmarked.
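
The single-coverage requirement mentioned above can be checked with a small script. This is a rough sketch, assuming "overlaps" is taken at the level of aligned blocks and that the blocks of each chain on one reference chromosome have already been extracted as half-open intervals. The function name `overlapping_chain_pairs` is hypothetical.

```python
from typing import Dict, List, Tuple


def overlapping_chain_pairs(
    ref_blocks: Dict[str, List[Tuple[int, int]]]
) -> List[Tuple[str, str]]:
    """Return pairs of chain IDs whose aligned reference blocks overlap.

    ref_blocks maps a chain ID to its half-open (start, end) blocks on a
    single reference chromosome. An empty result means every reference
    base is covered by at most one chain.
    """
    events = sorted(
        (start, end, cid)
        for cid, blocks in ref_blocks.items()
        for start, end in blocks
    )
    pairs = set()
    active: List[Tuple[int, str]] = []  # (end, chain ID) of blocks still open
    for start, end, cid in events:
        # drop blocks that end at or before the start of the current block
        active = [(e, c) for e, c in active if e > start]
        for _, other in active:
            if other != cid:
                pairs.add(tuple(sorted((other, cid))))
        active.append((end, cid))
    return sorted(pairs)


# Toy data (made up): chains "1" and "2" tile the reference without overlap
single_coverage = overlapping_chain_pairs(
    {"1": [(0, 100), (200, 300)], "2": [(100, 200)]}
)
# Here the blocks of "1" and "2" share reference bases 50-100
conflicts = overlapping_chain_pairs({"1": [(0, 100)], "2": [(50, 150)]})
```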