oushujun / EDTA

Extensive de-novo TE Annotator
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1905-y
GNU General Public License v3.0

TIR-Learner results aren't DNA transposons #133

Open coopergrace opened 3 years ago

coopergrace commented 3 years ago

Hi Shujun,

I'd previously used EDTA when you included MITE-Tracker (or MITE Hunter? I'm not sure which) and got some decent results for a de novo yeast species. Having seen a lot of poor reports regarding MITE-Tracker/Hunter, I decided to retry EDTA now that you've switched to TIR-Learner. However, it now returns far more supposedly full-length transposons than I thought were present, and next to no MITEs, whereas my previous results were 5-10 transposons and 5k+ MITEs. Checking the transposons with an ORF finding program has revealed that >90% are simply host genes that just so happen to have short TIR-like flanking sequences. Even re-running TIR-Learner with a basic transposon library from a distantly related species doesn't seem to reduce the number of false positives.

Any suggestions you have would be appreciated. I thought I'd ask here just in case, as the TIR-Learner page on here is pretty quiet and you've been very quick to help in the past.

Cheers

oushujun commented 3 years ago

Hi @coopergrace ,

Sorry for the delayed response. The beta versions of EDTA used MITE-Hunter to identify MITEs (TIR transposons shorter than 600 bp), but we found that it ran slowly, especially on large genomes, and contributed no new TIR candidates beyond those from TIR-Learner. So I removed it from the official release of EDTA. As you mentioned, TIR-Learner may report too many candidates, which then require further filtering. Your approach of filtering based on ORF finding or gene BLAST searches should be effective. Another option is to filter out low-copy elements, as those are usually false positives (though you will lose low-copy MITEs as well).
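A minimal sketch of the low-copy filter mentioned above, in Python. It assumes you already have one family ID per genome hit (for example, parsed from a whole-genome annotation); the function name `filter_low_copy`, the input layout, and the `min_copies=3` cutoff are all hypothetical choices for illustration, not EDTA or TIR-Learner settings.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def filter_low_copy(
    candidates: Iterable[str],
    hit_family_ids: Iterable[str],
    min_copies: int = 3,  # hypothetical cutoff; the thread does not give a value
) -> Tuple[List[str], List[str]]:
    """Split candidate TIR families into kept and dropped lists by copy number.

    candidates: family IDs of the TIR candidates to evaluate.
    hit_family_ids: one family ID per genome hit (assumed input layout).
    """
    counts = Counter(hit_family_ids)  # families with no hits count as 0
    kept, dropped = [], []
    for family in candidates:
        if counts[family] >= min_copies:
            kept.append(family)
        else:
            dropped.append(family)
    return kept, dropped


if __name__ == "__main__":
    hits = ["TIR_1"] * 5 + ["TIR_2"]
    kept, dropped = filter_low_copy(["TIR_1", "TIR_2", "TIR_3"], hits)
    print("kept:", kept)
    print("dropped:", dropped)
```

Reviewing the dropped list before discarding it is worthwhile, since genuine low-copy MITEs end up there too.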

The current version of EDTA does not report MITEs because I find it confuses people from time to time (there were two categories in the reporting summary, e.g. MITE - DTA and TIR - DTA). Moreover, the definition of a MITE is somewhat arbitrary: TIR elements ≤600 bp. So I label all TIR elements as TIRs. You may use the old criterion to reclassify some TIRs back to MITEs for your needs. Let me know if I can help further.

Best, Shujun

awaisfarooq724 commented 1 year ago

I have collected the results from EDTA and could not find any MITEs. Could you help me use the old criteria to reclassify TIRs back to MITEs? Do I have to rerun an older version of EDTA, or should I run MITE-Hunter separately? I am new to using this kind of software; could you please help me get this done in the shortest possible way?

oushujun commented 1 year ago

Hi, the current EDTA does not include MITE-Hunter or other similar MITE-specific programs. This is because TIR-Learner can identify most MITEs, and we found no benefit in adding an extra MITE program to EDTA. Intact TIR elements shorter than 600 bp are classified as MITEs.
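The length criterion above can be applied to a TIR library with a short script. The Python sketch below is one possible way to do it, under stated assumptions: FASTA headers of the form `name#DNA/DTx` (where `DTx` is a TIR superfamily code such as `DTA`) and the output label `MITE/DTx` are assumed conventions, and `read_fasta` and `label_mites` are hypothetical helper names. The comparison is strict (shorter than 600 bp); change `<` to `<=` if you prefer the ≤600 bp definition.

```python
from typing import Iterable, Iterator, List, Tuple

MITE_MAX_LEN = 600  # from the thread: TIR elements shorter than 600 bp


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def label_mites(
    records: Iterable[Tuple[str, str]], max_len: int = MITE_MAX_LEN
) -> List[Tuple[str, str]]:
    """Relabel short TIR records as MITEs; leave all other records unchanged."""
    relabeled = []
    for header, seq in records:
        name, sep, classification = header.partition("#")
        # Assumed convention: TIR superfamilies are written as DNA/DTA, DNA/DTC, ...
        is_tir = classification.startswith("DNA/DT")
        if sep and is_tir and len(seq) < max_len:
            superfamily = classification.split("/", 1)[1]
            header = f"{name}#MITE/{superfamily}"
        relabeled.append((header, seq))
    return relabeled


if __name__ == "__main__":
    import sys

    # Usage: python label_mites.py library.fa > relabeled.fa
    with open(sys.argv[1]) as handle:
        for header, seq in label_mites(read_fasta(handle)):
            print(f">{header}\n{seq}")
```

This only changes header labels; sequences and non-TIR entries pass through untouched, so no rerun of EDTA is needed.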

Best, Shujun

awaisfarooq724 commented 1 year ago

Dear Oushujun, thanks for the response. I think that will suffice for me.