Open pawanchk opened 12 months ago
Hi,
The definition of the size column in the Anticodon output file is the same as in Isodecoder, i.e. the number of sequences in the reference file that have a specific anticodon.
The alignment settings for GSNAP are in the align.log file - you will see there that the alignment mode is set to default, i.e. both forward and reverse strand are considered.
I'm not sure I understand what you mean by length information of the tRNA sequences in the input data - if this refers to the length of the mapped reads, this info can be obtained from the bam files in /align. Another useful file is RTstopTable.csv in /mods: this includes tRNA/cluster, canonical tRNA position and proportion of reads that stop at each position (normalized to total coverage of the reference sequence). This gives the relative frequency of reads stopping at all positions for a reference, the sum of which should equal 1.
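If it helps, here is a minimal sketch of how the mapped read lengths could be tallied per reference. It is not part of mim-tRNAseq: the function name read_length_histogram is made up for illustration, and it assumes the bam file has first been converted to SAM text (for example with samtools view). It relies only on the standard SAM column layout (FLAG in column 2, reference name in column 3, read sequence in column 10).

```python
from collections import Counter


def read_length_histogram(sam_lines):
    """Tally mapped read lengths per reference from SAM-formatted text lines.

    Hypothetical helper for illustration only; not part of mim-tRNAseq.
    Returns {reference_name: Counter({read_length: number_of_reads})}.
    """
    hist = {}
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag, ref, seq = int(fields[1]), fields[2], fields[9]
        # Skip unmapped reads (FLAG bit 0x4) and records without a stored sequence.
        if flag & 0x4 or seq == "*":
            continue
        # Note: len(seq) includes any soft-clipped bases.
        hist.setdefault(ref, Counter())[len(seq)] += 1
    return hist
```

The input can be any iterable of SAM lines, e.g. an open text file or sys.stdin when piping from samtools view. Comparing the most common read lengths for a reference with the length of that reference sequence gives a rough idea of whether reads cover it end to end or only in part.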
Hi,
Thank you for your detailed response - this is very helpful.
For additional clarification, please let me provide more details for my questions 2 and 3.
Regarding the alignment mode - are you referring to this from GSNAP?
--mode=STRING Alignment mode: standard (default), cmet-stranded, cmet-nonstranded,
atoi-stranded, atoi-nonstranded, ttoc-stranded, or ttoc-nonstranded.
Non-standard modes requires you to have previously run the cmetindex
or atoiindex programs (which also cover the ttoc modes) on the genome
For example, for Homo_sapiens_tRNA-Ala-CGC, we get counts in the Anticodon_counts_raw.txt file. How much of this tRNA sequence is found in the input data: is it mapped full length, or is only part of it mapped? Where can I find this information among the output files?
Hi,
I have some questions regarding the output files generated after the mim-tRNAseq analysis -
1. In the raw counts file counts/Anticodon_counts_raw.txt, what does the last column, size, refer to? In the manual https://mim-trnaseq.readthedocs.io/en/latest/output.html, I noticed that an explanation of size is given for the Isodecoder output file, but not for the Anticodon output file.
2. If I use the reverse complement of the fastq files as input, the output counts do not change. Does the program already consider the reverse complement?
3. Among the output files, where can I find the length information of the tRNA sequences that are found in the input data?
I look forward to hearing from you.