zhangrengang / TEsorter

TEsorter: an accurate and fast method to classify LTR-retrotransposons in plant genomes
https://doi.org/10.1093/hr/uhac017
GNU General Public License v3.0

Exploring the transposition profile of specific LTRs #54

Open shuren7 opened 2 months ago

shuren7 commented 2 months ago

Dear Professor Zhang (@zhangrengang), I apologize for disturbing you during your busy schedule. I know that LTR_retriever and TEsorter are often used together, but the author of LTR_retriever, Shujun Ou, seems to be very busy these days, so I would like to put my question to you instead.

My main question is how to determine the number of transpositions of a specific LTR-RT, and how to identify the parent or children of a specific LTR. Expression quantification with TEtranscripts indicates that some LTRs are active, but I am curious how many times they have transposed within a single genome. Please note that I am currently working with a single genome only.

This also concerns the LTR_retriever result files. I do not quite understand what "LTRlib.redundant.fa" specifically refers to, and I have not been able to find an explanation. Does "redundant" mean that the file contains all identified LTRs, and does "LTRlib.fa" contain the representative LTRs left after removing duplicates?

I think this relates to the search for the parent of an LTR, because I am focusing on the possible origins of specific LTRs and on how many copies each has formed in the genome, especially for highly active LTRs and their transposition patterns.

I tried using BLASTN to find the parent of individual LTRs (I am not sure whether this is correct), searching against "LTRlib.redundant.fa". The results are shown in the attached image. Is it possible to determine whether the matched LTRs with identity greater than 99% and length coverage greater than 99% are the parent of the query LTR or copies resulting from its transposition? Or should I run BLASTN against the genomic DNA instead?
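To make the "identity greater than 99% and length coverage greater than 99%" filter concrete, here is a minimal Python sketch that counts such hits per query in BLASTN tabular output. It assumes the search was run with a custom column layout that includes `qlen`; the file names in the comment and the function name `count_near_identical_hits` are hypothetical. Such a count only gives a rough estimate of near-identical copies. It does not tell which hit is the parent and which is the child, and if the query itself is in the searched library or genome, its self-hit is counted too.

```python
import csv
import io
from collections import defaultdict

# Assumed column layout, e.g. from a run such as (file names are placeholders):
#   blastn -query my_ltr.fa -subject genome.fa \
#          -outfmt "6 qseqid sseqid pident length qstart qend sstart send qlen"
COLUMNS = ["qseqid", "sseqid", "pident", "length",
           "qstart", "qend", "sstart", "send", "qlen"]


def count_near_identical_hits(handle, min_identity=99.0, min_coverage=99.0):
    """Count, per query, the hits passing both identity and query-coverage cutoffs."""
    counts = defaultdict(int)
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(COLUMNS, row))
        identity = float(rec["pident"])
        # Query coverage: aligned span on the query as a percentage of query length.
        q_span = abs(int(rec["qend"]) - int(rec["qstart"])) + 1
        coverage = 100.0 * q_span / int(rec["qlen"])
        if identity >= min_identity and coverage >= min_coverage:
            counts[rec["qseqid"]] += 1
    return dict(counts)


# Toy input (made-up values, not real results): two hits pass, two fail.
example = (
    "ltrA\tchr1\t99.8\t5000\t1\t5000\t100\t5099\t5000\n"
    "ltrA\tchr2\t99.5\t4990\t5\t4994\t9000\t4011\t5000\n"
    "ltrA\tchr3\t92.0\t5000\t1\t5000\t1\t5000\t5000\n"
    "ltrA\tchr4\t99.9\t1200\t1\t1200\t50\t1249\t5000\n"
)
print(count_near_identical_hits(io.StringIO(example)))  # {'ltrA': 2}
```

Each HSP is treated as one hit here, so a copy that BLASTN splits into several alignments would fail the coverage cutoff rather than be merged.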

Many LTRs leave fragments behind after transposition, but I am not considering that case for now, although I would welcome your opinion on it. In short, I just want to know the copy number of specific LTRs.

What would you suggest? I would be very grateful for a reply; your answer is very important to me. Sincerely, Shuren

zhangrengang commented 2 months ago

I am sorry, but I am not sure how to identify the mother of an LTR-RT. Perhaps a phylogeny-based method (https://github.com/zhangrengang/TEsorter?tab=readme-ov-file#further-phylogenetic-analyses) could work. If your specific LTR-RT clusters with other inactive LTR-RTs, they may be its mother, sisters, or children.

> What I don't quite understand is what "LTRlib.redundant.fa" specifically refers to, and I can't find an explanation in many places. Does "redundant" refer to all existing LTRs? And "LTRlib.fa" refers to representative LTRs after removing duplicates?

I think you are right.
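A quick sanity check of this reading is to count the sequences in both files: if "LTRlib.fa" holds the representatives left after removing duplicates, it should contain no more records than "LTRlib.redundant.fa". The sketch below is a minimal example, assuming both files are plain-text FASTA; the function names are hypothetical and the paths are placeholders for wherever LTR_retriever wrote its output.

```python
def count_fasta_records(path):
    """Count FASTA records by counting header lines (lines starting with '>')."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def compare_libraries(redundant_path, nonredundant_path):
    """Return (redundant_count, nonredundant_count) for the two library files."""
    return count_fasta_records(redundant_path), count_fasta_records(nonredundant_path)


# Example call (paths are placeholders, adjust to your own output directory):
# n_all, n_rep = compare_libraries("LTRlib.redundant.fa", "LTRlib.fa")
# print(n_all, n_rep)
```

The counts only show that one file is a reduced version of the other; they say nothing about how the representatives were chosen.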