oushujun / LTR_retriever

LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons. The LTR Assembly Index (LAI) is also included in this package.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5813529/
GNU General Public License v3.0

What is the meaning of the full, left, right, total, and all in whole-genome LTR-RT annotation files? #31

Closed: yuanhelianyi closed this issue 5 years ago

yuanhelianyi commented 5 years ago

Hi, shujun

When I run LTR_retriever, I am confused by the output files from the whole-genome LTR-RT annotation that uses the non-redundant library. I don't understand what the full, left, right, total, and all columns mean. Could you please explain them?

Zhao Jing

[Attached screenshot: 2018-12-10 6 16 47]

oushujun commented 5 years ago

Hi Jing,

Sorry for the poor naming. These entries should be:

RepeatMasker_entry TE_family Full_length Left_end_only Right_end_only Converted_copy_number Total_entries Total_length_in_bp Whole_genome_percentage Class Subclass Note

Please let me know if these names are not clear.

Best, Shujun

yuanhelianyi commented 5 years ago

Hi Shujun,

Thank you very much for your explanation. However, I still can't understand the relationship between Full_length, Left_end_only, Right_end_only, and Converted_copy_number. I checked the script you called and found "$total = int ( $full{$key} + ($left{$key}+$right{$key})/2 + 0.5)". Why is it calculated this way? I am also confused about the script shown below: why does it set not only "right_end = 1" but also "left_end = 1" when "$TE_head <= 20"?

Zhao Jing

[Attached screenshot: 2018-12-12 11 32 44]

oushujun commented 5 years ago

Hi Jing,

Full_length counts the cases where the query sequence aligns with the full length of the annotation sequence. Left_end_only and Right_end_only count partial alignments in which only the left end or only the right end of the annotation sequence is aligned to the query, respectively. Converted_copy_number accounts for these partial alignments: in the formula you quoted, each left-end-only or right-end-only hit counts as half a copy, and the sum is rounded to the nearest integer. As for the code you pointed out, there is a bug; thanks for catching it! The corrected code should be:

    if ($strand eq "+") {
        $left_end = 1 if ($TE_head <= 20);
        $right_end = 1 if ($TE_left <= 20);
    }
    else {
        $right_end = 1 if ($TE_head <= 20);
        $left_end = 1 if ($TE_left <= 20);
    }

Please correct it in your local copy for now; I will update the repository later.
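For readers who want to check the strand logic outside the Perl script, here is a minimal Python sketch of the corrected branch above. The function name `end_flags`, the `max_gap` parameter, and the assumption that both flags start at 0 are illustrative and not taken from the LTR_retriever source. As in the Perl snippet, any strand value other than "+" is treated as the reverse orientation.

```python
def end_flags(strand, te_head, te_left, max_gap=20):
    """Return (left_end, right_end) flags for one alignment.

    Hypothetical helper mirroring the corrected Perl branch. Both flags are
    assumed to start at 0 and are set to 1 when the corresponding distance
    is at most max_gap (20 in the original snippet).
    """
    head_ok = int(te_head <= max_gap)
    left_ok = int(te_left <= max_gap)
    if strand == "+":
        # Forward strand: TE_head sets left_end, TE_left sets right_end.
        return head_ok, left_ok
    # Any other strand: the two assignments are swapped.
    return left_ok, head_ok


if __name__ == "__main__":
    print(end_flags("+", 5, 300))   # only the left-end flag is set
    print(end_flags("-", 5, 300))   # only the right-end flag is set
    print(end_flags("+", 10, 10))   # both flags are set
```

With the original bug, a single small `$TE_head` would have set both flags at once. In the corrected logic, each flag depends on exactly one of the two distances.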

Thanks, Shujun
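
As a worked example of the Converted_copy_number formula quoted earlier in the thread, `int(full + (left + right)/2 + 0.5)`, here is a short Python sketch. The function name and the sample counts are made up for illustration; only the arithmetic comes from the quoted line. It assumes the counts are non-negative, so adding 0.5 before truncating rounds halves up.

```python
def converted_copy_number(full, left, right):
    """Combine full-length and partial hit counts into one copy number.

    Follows the quoted formula: each left-end-only or right-end-only hit
    counts as half a copy, and adding 0.5 before truncation rounds the
    result to the nearest integer (halves round up for non-negative counts).
    """
    return int(full + (left + right) / 2 + 0.5)


if __name__ == "__main__":
    # Hypothetical counts for one TE family: 10 full-length hits,
    # 3 left-end-only hits, and 4 right-end-only hits.
    print(converted_copy_number(10, 3, 4))  # 10 + 3.5 = 13.5, rounds up to 14
    print(converted_copy_number(10, 3, 3))  # 10 + 3.0 = 13.0, stays 13
```

The idea behind the formula, as far as the thread explains it, is that a left-end-only hit and a right-end-only hit together are treated as roughly one fragmented copy.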