oushujun / LTR_retriever

LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons. The LTR Assembly Index (LAI) is also included in this package.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5813529/
GNU General Public License v3.0

Total length annotated by RepeatMasker is longer than LTR_retriever #143

Closed by th2ch-g 1 year ago

th2ch-g commented 1 year ago

Dear oushujun,

Thank you for creating a great tool.

This issue is a question, not a bug report.

My question is why the total length of LTRs annotated by RepeatModeler (with -LTRStruct) and RepeatMasker is much longer than the total length of LTRs annotated by LTR_retriever.

Is this happening because LTR_retriever only annotates intact LTRs?

Sincerely, th

oushujun commented 1 year ago

Hi th,

LTR_retriever identifies intact LTR elements and constructs a non-redundant library to annotate fragmented LTR sequences. When you say the length is much shorter, do you mean intact elements only, or both intact and fragmented LTR sequences?

Shujun

th2ch-g commented 1 year ago

Dear oushujun,

Thank you for replying.

I meant intact LTRs only. I think that is why the total LTR length annotated by LTR_retriever is much shorter.

By the way, I calculated the annotated LTR lengths from the .pass.list file (LTR_retriever output) and the .out file (RepeatMasker output).

Sincerely, th

oushujun commented 1 year ago

Hello th,

LTR_retriever annotates whole-genome LTR sequences (intact and fragmented), which can be found in the .out.gff file. The .pass.list file contains only intact LTR elements. The .out file (RepeatMasker output) should contain all LTR sequences (intact and fragmented). So you should compare the LTR_retriever .out.gff file with the RepeatMasker .out file.

Best, Shujun
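
For anyone wanting to run the comparison suggested above, here is a minimal Python sketch, not part of LTR_retriever itself. It assumes the .out.gff file is a tab-separated GFF with the sequence name in column 1 and start/end in columns 4 and 5, and that every feature in it is an LTR annotation. It also assumes the RepeatMasker .out file uses the standard whitespace-separated layout (query sequence, begin, end in columns 5 to 7, class/family in column 11). The function names are hypothetical, and merging overlapping intervals before summing is a choice made here so that no base is counted twice.

```python
import sys
from collections import defaultdict


def merged_length(intervals):
    """Sum bases covered by (seqid, start, end) intervals (1-based, inclusive).

    Overlapping or adjacent intervals on the same sequence are merged first,
    so each base is counted once.
    """
    by_seq = defaultdict(list)
    for seqid, start, end in intervals:
        by_seq[seqid].append((min(start, end), max(start, end)))
    total = 0
    for spans in by_seq.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end + 1:
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
    return total


def gff_intervals(lines):
    """Yield (seqid, start, end) from GFF lines; assumes all rows are LTRs."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        yield fields[0], int(fields[3]), int(fields[4])


def repeatmasker_ltr_intervals(lines):
    """Yield (seqid, start, end) for rows whose class/family starts with 'LTR'."""
    for line in lines:
        fields = line.split()
        # Data rows start with the integer SW score; header lines do not.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        if fields[10].startswith("LTR"):
            yield fields[4], int(fields[5]), int(fields[6])


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (file names are placeholders): script.py genome.out.gff genome.out
    with open(sys.argv[1]) as gff, open(sys.argv[2]) as rm_out:
        print("LTR_retriever .out.gff bp:", merged_length(gff_intervals(gff)))
        print("RepeatMasker .out LTR bp:",
              merged_length(repeatmasker_ltr_intervals(rm_out)))
```

If the two totals still differ substantially, the remaining gap would reflect differences between the two repeat libraries rather than the intact-versus-fragmented distinction.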