Closed yuanhelianyi closed 5 years ago
intact LTR_RTs file
chr3B:67330..76355 pass motif:TGCA TSD:GTGGT 67325..67329 76356..76360 IN:67801..75884 0.9597 ? Gypsy LTR 1593198
chr3B:285364..287351 pass motif:TGCA TSD:TTTAG 285359..285363 287352..287356 IN:285900..286815 1.0000 - Copia LTR 0
chr3B:532514..541248 pass motif:TGCA TSD:GTAAG 532509..532513 541249..541253 IN:533004..540758 0.9531 ? Gypsy LTR 1862714
chr3B:998081..1006756 pass motif:TGCA TSD:GATAC 998076..998080 1006757..1006761 IN:999858..1004977 0.9747 - Copia LTR 989868
chr3B:1039812..1048675 pass motif:TGCA TSD:ACGAC 1039807..1039811 1048676..1048680 IN:1040293..1048195 0.9688 ? Gypsy LTR 1225675
chr3B:1384461..1392963 pass motif:TGCA TSD:ATA 1384458..1384460 1392964..1392966 IN:1386153..1391269 0.9539 - Copia LTR 1829911
chr3B:1464448..1478224 pass motif:TGCA TSD:ACTTG 1464443..1464447 1478225..1478229 IN:1466199..1476473 0.9926 - Copia LTR 286029
chr3B:1557619..1567700 pass motif:TGCA TSD:ACCAC 1557614..1557618 1567701..1567705 IN:1558136..1567183 0.9729 ? Gypsy LTR 1061605
chr3B:1634407..1648820 pass motif:TGCA TSD:CCATC 1634402..1634406 1648821..1648825 IN:1635948..1647278 0.9760 + Copia LTR 938169
chr3B:1698438..1712708 pass motif:TGCA TSD:CCGTT 1698433..1698437 1712709..1712713 IN:1702565..1708578 0.9920 + Gypsy LTR 309345
chr3B:2243109..2253093 pass motif:TGCA TSD:CCGCT 2243104..2243108 2253094..2253098 IN:2243580..2252622 0.9873 ? Gypsy LTR 492644
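For anyone who wants to work with these records programmatically, here is a minimal Python sketch that parses one whitespace-separated record of the kind shown above. The field names (`tsd`, `internal`, `insertion_time`, and so on) are my own labels inferred from the example rows, not necessarily the tool's official column names, and the helper names `parse_span` and `parse_record` are hypothetical.

```python
from typing import NamedTuple, Tuple


class IntactLTR(NamedTuple):
    chrom: str
    start: int
    end: int
    tsd: str
    internal: Tuple[int, int]  # coordinates after the "IN:" prefix
    identity: float            # value in the 8th column
    strand: str                # "+", "-" or "?"
    superfamily: str           # e.g. "Gypsy" or "Copia"
    insertion_time: int        # last column, assumed to be an integer


def parse_span(text: str) -> Tuple[int, int]:
    """Turn '67330..76355' into (67330, 76355)."""
    lo, hi = text.split("..")
    return int(lo), int(hi)


def parse_record(line: str) -> IntactLTR:
    """Parse one record; assumes exactly the 12-column layout shown above."""
    f = line.split()
    chrom, span = f[0].rsplit(":", 1)
    start, end = parse_span(span)
    return IntactLTR(
        chrom=chrom,
        start=start,
        end=end,
        tsd=f[3].split(":", 1)[1],
        internal=parse_span(f[6].split(":", 1)[1]),
        identity=float(f[7]),
        strand=f[8],
        superfamily=f[9],
        insertion_time=int(f[11]),
    )


# First record from the listing above.
rec = parse_record(
    "chr3B:67330..76355 pass motif:TGCA TSD:GTGGT 67325..67329 "
    "76356..76360 IN:67801..75884 0.9597 ? Gypsy LTR 1593198"
)
```

Parsing into named fields makes it easier to compare each element's coordinates and superfamily against the whole-genome annotation.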
Hello,
Yes, you are right. The mixed annotation is due to the use of the uniq library. RepeatMasker (or rmblastn) simply picks the library entry that aligns most closely to the query sequence, and the structure of the intact LTR element provides no guidance for this process. Because TE sequences are so repetitive, their annotations are not as precise as gene annotations.
Best, Shujun
Hi Shujun,
Can I conclude that intact LTR-RTs with a mixed annotation are inaccurate? Should they be removed from the intact LTR-RT results?
Zhao Jing
Hi Jing,
Not necessarily. You can verify the LTR structure to confirm whether an element is real. In many cases an LTR element is nested within other sequences, or vice versa.
Thanks, Shujun
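One rough way to see how mixed an element's annotation is, before inspecting its structure by hand, is to measure what fraction of the element each annotated label covers. The sketch below is only an illustration under stated assumptions: `coverage_by_label` is a hypothetical helper, the hits are plain `(start, end, label)` tuples with 1-based inclusive coordinates that do not overlap each other, and the example coordinates are invented rather than taken from the files in this thread.

```python
from collections import defaultdict


def coverage_by_label(elem_start, elem_end, hits):
    """Fraction of the element [elem_start, elem_end] covered by each label.

    hits: iterable of (start, end, label), 1-based inclusive.
    Assumes hits do not overlap each other; overlapping hits with the
    same label would be counted twice.
    """
    covered = defaultdict(int)
    for start, end, label in hits:
        lo = max(start, elem_start)
        hi = min(end, elem_end)
        if lo <= hi:  # the hit intersects the element
            covered[label] += hi - lo + 1
    length = elem_end - elem_start + 1
    return {label: bp / length for label, bp in covered.items()}


# Invented example: a 1000 bp element whose middle 200 bp is annotated
# as a different superfamily, as could happen with a nested insertion.
hits = [
    (1, 400, "Gypsy"),
    (401, 600, "Copia"),
    (601, 1000, "Gypsy"),
    (1500, 1800, "Copia"),  # outside the element, ignored
]
fractions = coverage_by_label(1, 1000, hits)
```

An element dominated by one label with a short stretch of another is consistent with nesting and is a candidate for structural verification rather than removal.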
Hi, shujun
When I compare the whole-genome annotation with the list of all LTR-RTs, the two results disagree, which confuses me. I think this is a result of using the uniq lib for annotation. Is that right? Is it necessary to remove the entries in the intact LTR_RTs file that differ from the annotation file? In the example, only 4 of the 11 intact LTR-RTs are annotated precisely.