oushujun / EDTA

Extensive de-novo TE Annotator
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1905-y
GNU General Public License v3.0

TIR-Learner is slow with many contigs #308

Open mergi-2674 opened 1 year ago

mergi-2674 commented 1 year ago

Thank you for developing such an excellent pipeline! I am running EDTA on a plant genome that is 3 GB in size, on a PC with 380 GB of RAM and 32 threads. It has been running for a week without producing a final result and appears to be stuck at the TIR step. How long should the analysis take to finish?

Best, Mergi

[Image: Screenshot from 2022-10-19 11-41-36]

oushujun commented 1 year ago

Yes, the TIR step will be slow if your genome is very fragmented: TIR-Learner slows down considerably when there are thousands of contigs. You can remove the small contigs and rerun, or keep waiting. Alternatively, you can concatenate the small contigs into a single sequence, Chr0, which will speed up the run considerably.

Shujun
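
Before choosing between the options above, it can help to see how fragmented the assembly actually is. The following is a minimal sketch, not part of EDTA: the function names and the 100 kb cutoff are assumptions made for illustration, and the cutoff should be picked to suit your own assembly.

```python
from typing import Dict, Iterable


def contig_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {contig name: length} from an iterable of FASTA lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            fields = line[1:].split()
            name = fields[0] if fields else f"unnamed_{len(lengths)}"
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def summarize(lengths: Dict[str, int], cutoff: int = 100_000) -> Dict[str, int]:
    """Count contigs shorter than `cutoff` (an assumed, arbitrary threshold)."""
    small = [n for n, length in lengths.items() if length < cutoff]
    return {
        "contigs": len(lengths),
        "below_cutoff": len(small),
        "bases_below_cutoff": sum(lengths[n] for n in small),
    }
```

For a real genome, open the FASTA file and pass the file handle, for example `summarize(contig_lengths(open("genome.fa")))`, where `genome.fa` is a placeholder file name.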

mergi-2674 commented 1 year ago

Hi Shujun, thank you for your reply. My genome has thousands of contigs, which is why the TIR step is taking so long. I would like to concatenate the small contigs into Chr0, but I don't know how. Could you please help me with this?

Best, Mergi

oushujun commented 1 year ago

Hi Mergi,

You can search around and write your own script. For example, you can start from here: https://www.biostars.org/p/379245/

Best, Shujun
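
For readers with the same question, here is one way the Chr0 concatenation suggested in this thread could be sketched in plain Python. It is not the maintainer's script and is not taken from the linked Biostars post. The 100 kb length cutoff, the 100 bp run of Ns placed between joined contigs, and all function names are assumptions chosen for illustration; only the name Chr0 comes from the thread.

```python
import sys
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_small_contigs(
    records: Iterable[Record],
    min_len: int = 100_000,    # assumed cutoff; tune for your assembly
    spacer_len: int = 100,     # assumed N spacer between joined contigs
    merged_name: str = "Chr0",
) -> List[Record]:
    """Keep contigs >= min_len as they are; join the rest into one record."""
    kept: List[Record] = []
    small: List[str] = []
    for header, seq in records:
        if len(seq) >= min_len:
            kept.append((header, seq))
        else:
            small.append(seq)
    if small:
        # The N spacer marks the artificial junctions between contigs.
        kept.append((merged_name, ("N" * spacer_len).join(small)))
    return kept


def write_fasta(records: Iterable[Record], out, width: int = 60) -> None:
    """Write records as FASTA with fixed-width sequence lines."""
    for header, seq in records:
        out.write(f">{header}\n")
        for i in range(0, len(seq), width):
            out.write(seq[i:i + width] + "\n")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python merge_small_contigs.py genome.fa > genome.chr0.fa
    with open(sys.argv[1]) as handle:
        write_fasta(merge_small_contigs(read_fasta(handle)), sys.stdout)
```

Note that annotations reported on Chr0 use coordinates in the concatenated sequence, so mapping them back to the original contigs would require recording each contig's offset, which this sketch does not do.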