oushujun / EDTA

Extensive de-novo TE Annotator
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1905-y
GNU General Public License v3.0

Further classification #391

Closed: SC-Duan closed this issue 11 months ago

SC-Duan commented 1 year ago

Dear Ou, could you please give me some advice? I want to detect TE insertion polymorphisms (TIPs) based on EDTA results in my project, but the current TE classification in EDTA only reaches subfamilies such as copia and gypsy. Can these be classified further, for example into Ale, Angela, or Ogre? One approach I have considered is extracting the TE sequences and feeding them to TEsorter. Do you think it is suitable to use the TEanno.split.gff3 file for this? Thanks a lot! Shengchang

oushujun commented 1 year ago

Hi Shengchang,

EDTA provides classification down to the superfamily level and the family level. If you would like clade-level classification for LTR elements, TEsorter is a great tool for this purpose. You may classify the EDTA-generated library, so that each clade classification can be applied to all sequences labeled with the same family.

Shujun
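
The library-level approach described above can be sketched in Python. This is a minimal illustration, not part of EDTA or TEsorter. It assumes the TEsorter classification table is tab-separated, has the sequence ID in its first column and a column headed `Clade`, and that the part of each library ID before `#` matches the `Name=` attribute in the EDTA GFF3 file. The function names and the added `Clade=` attribute are hypothetical.

```python
import csv
import io
import re


def load_clades(cls_tsv_text):
    """Map library family name -> clade from a TEsorter-style table.

    Assumption: tab-separated text with a header row, the sequence ID in
    the first column, and a column named "Clade".
    """
    clades = {}
    reader = csv.reader(io.StringIO(cls_tsv_text), delimiter="\t")
    header = next(reader)
    clade_col = header.index("Clade")
    for row in reader:
        if not row:
            continue
        # Assumption: library IDs look like "<family>#<classification>".
        family = row[0].split("#", 1)[0]
        clades[family] = row[clade_col]
    return clades


NAME_RE = re.compile(r"(?:^|;)Name=([^;]+)")


def annotate_gff(gff_lines, clades):
    """Append a hypothetical Clade= attribute to features of classified families."""
    out = []
    for line in gff_lines:
        line = line.rstrip("\n")
        cols = line.split("\t")
        # Pass comments, blank lines, and malformed rows through unchanged.
        if line.startswith("#") or len(cols) != 9:
            out.append(line)
            continue
        match = NAME_RE.search(cols[8])
        clade = clades.get(match.group(1)) if match else None
        if clade:
            cols[8] = cols[8].rstrip(";") + ";Clade=" + clade
        out.append("\t".join(cols))
    return out
```

Because the lookup is keyed on the family name, every genomic copy of a family inherits the same clade, and families that TEsorter could not classify are left untouched.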

SC-Duan commented 1 year ago

Dear Shujun, thank you very much. I have tried TEsorter on the EDTA-generated library, but the classification rate is really low (820/3464, about 24%), and I do not know whether this is normal. Also, could I classify the final results directly according to the "Name=" attribute in the EDTA.anno/.fa.mod.EDTA.TEanno.split.gff3 file? This attribute should be one-to-one with the sequence names in the EDTA-generated library. Shengchang

oushujun commented 1 year ago

Shengchang,

Yes, the classification rate based on HMM is usually low because many TE sequences lack coding sequences. If you classify each TE sequence in the genome (GFF file), you may run into inconsistencies, in which TEs labeled with the same "Name=" receive different TEsorter classifications. This could be due to nested insertions or to misclassification by EDTA, and the two are difficult to distinguish.

Shujun
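
The inconsistency described above can be checked with a short Python sketch. It assumes you have already classified individual genomic copies and collected one `(family_name, clade)` pair per classified copy, where `family_name` is the copy's "Name=" value. The function name and input layout are hypothetical, and the sketch only reports conflicts; it cannot tell nested insertions from misclassification.

```python
from collections import Counter, defaultdict


def clade_conflicts(copies):
    """Return families whose classified copies fall into more than one clade.

    copies: iterable of (family_name, clade) pairs, one per classified copy.
    Result: {family_name: {clade: number_of_copies}} for conflicting families.
    """
    per_family = defaultdict(Counter)
    for family, clade in copies:
        per_family[family][clade] += 1
    return {
        family: dict(counts)
        for family, counts in per_family.items()
        if len(counts) > 1
    }
```

Families that appear in the result are the ones where per-copy classification disagrees with itself, which is the case that the library-level classification avoids.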

SC-Duan commented 1 year ago

Dear Shujun, I got it, thank you!