UCSC-LoweLab / tRNAscan-SE

A program for detection of tRNA genes
GNU General Public License v3.0
55 stars 7 forks source link

tRNAscan-SE: An improved tool for transfer RNA detection

Patricia Chan, Brian Lin, and Todd Lowe

School of Engineering, University of California, Santa Cruz, CA

tRNAscan-SE predicts tRNA genes from input DNA or RNA sequences in FASTA format. tRNA predictions are output in standard tabular format, secondary structure file, FASTA file, BED file, and GTF file.

tRNAscan-SE pioneers the large-scale use of covariance models to annotate tRNA genes in genomes. A covariance model is an implementation of a stochastic context-free grammar, able to integrate both primary sequence and secondary structure information, and is trained on an aligned, structurally annotated set of RNAs. Any given sequence can be searched for tRNAs by alignment to a tRNA covariance model. tRNAscan-SE 2.0 combines the use of the latest Infernal v1.1 (1) as the covariance model search engine and covariance models specifically trained and built using tRNA sequences from available genomes in the three domains of life for gene prediction. The method replaces the original use of COVE (2)
with two prefilters - tRNAscan 1.3 (3) and an implementation of an algorithm described by Pavesi and colleagues (4) for searching eukaryotic pol III tRNA promoters (our implementation referred to as EufindtRNA), which is still available as a backward compatible option. Predicted tRNA genes will then be assessed using a set of isotype-specific covariance models. Comparative analysis among these models enables better annotation, particularly of atypical tRNAs, some of which may produce recoding events due to mutations in the anticodon. The new tRNAscan-SE also enables better recognition of tRNA-derived SINEs that are abundant in many eukaryotic genomes by using a post quality filter.

This distribution includes the source code of tRNAscan-SE, the convariance models, all the files necessary to compile and run the complete COVE package (version 2.4.4), all the files necessary to compile and run the modified version of tRNAscan (version 1.4), and all the files needed to compile and run eufindtRNA 1.0 (the cove programs, tRNAscan 1.4, and eufindtRNA are included for use with the tRNAscan-SE program, but may also be run as stand-alone programs). Installation of the PERL (Practical Extraction and Report Language, Larry Wall) interpreter package version 5.0 or later is required to run the tRNAscan-SE PERL script. Users also need to download and install Infernal before installing and using tRNAscan-SE. The Infernal source package can be obtained at http://eddylab.org/infernal/.

A copy of this software can be obtained at http://trna.ucsc.edu/tRNAscan-SE/ or at https://github.com/UCSC-LoweLab/tRNAscan-SE

If you use this software, please cite

Chan, P.P., Lin, B.Y., Mak, A.J. and Lowe, T.M. (2021) "tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes", Nucleic Acids Res. 49:9077–9096. https://doi.org/10.1093/nar/gkab688

If you have any questions, bug reports, or suggestions, please e-mail

trna@soe.ucsc.edu

Lowe Lab Department of Biomolecular Engineering University of California 1156 High Street Santa Cruz, ZA 95064

References

  1. Nawrocki, E.P. and Eddy, S.R. (2013) "Infernal 1.1: 100-fold Faster RNA Homology Searches", Bioinformatics, 29, 2933-2935.

  2. Eddy, S.R. and Durbin, R. (1994) "RNA sequence analysis using covariance models", Nucl. Acids Res., 22, 2079-2088.

  3. Fichant, G.A. and Burks, C. (1991) "Identifying potential tRNA genes in genomic DNA sequences", J. Mol. Biol., 220, 659-671.

  4. Pavesi, A., Conterio, F., Bolchi, A., Dieci, G., Ottonello, S. (1994) "Identification of new eukaryotic tRNA genes in genomic DNA databases by a multistep weight matrix analysis of transcriptional control regions", Nucl. Acids Res., 22, 1247-1256.