Closed AlisaGU closed 8 months ago
Hi,
Thank you for your interest in the software. Once the TE index is built, TElocal should finish within 3-4 hrs given the size of your BAM file, and will probably need at most 40 GB of RAM (though it can sometimes spike higher). It is the index building that can take days, depending on the number of TE annotations. We are trying to speed up this process in the next release (and potentially remove the need to prebuild indices), but for now, that is definitely the bottleneck.
Thanks.
Wow, so quick a reply!
There are 60108975 TEs in the genome. Can I split the annotation into several files, then build an index and count reads for each one separately?
Hi,
Unfortunately, we don't recommend splitting the TE index, as it does cause issues in the EM. The counting though would be relatively quick (order of hours), and you can certainly count in parallel with TElocal. Unfortunately, pre-building the TE index will take quite a long time (maybe more than 7 days), but once done, you won't have to deal with it again and can just count.
Thanks.
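To illustrate the "count in parallel" suggestion above, here is a minimal Python sketch that launches one TElocal run per BAM file against a single, already-built TE index. It caps the number of simultaneous runs using the roughly 40 GB per run mentioned earlier in this thread. The flag names (-b, --GTF, --TE, --project) are assumptions here and should be checked against `TElocal --help`; the helper functions are made up for this example, and the file names in the usage comment are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Rough per-run peak quoted in this thread; real usage may spike higher.
RAM_PER_JOB_GB = 40


def max_parallel_jobs(total_ram_gb, ram_per_job_gb=RAM_PER_JOB_GB):
    """Number of runs that fit in memory at once (never fewer than one)."""
    return max(1, total_ram_gb // ram_per_job_gb)


def build_command(bam, gene_gtf, te_index):
    """Assemble one TElocal call. Flag names are assumed, not verified here."""
    return [
        "TElocal",
        "-b", str(bam),
        "--GTF", str(gene_gtf),
        "--TE", str(te_index),
        "--project", Path(bam).stem,  # one output prefix per sample
    ]


def count_all(bams, gene_gtf, te_index, total_ram_gb):
    """Run TElocal on every BAM, a few at a time, and return the exit codes."""
    commands = [build_command(b, gene_gtf, te_index) for b in bams]
    workers = max_parallel_jobs(total_ram_gb)
    # Threads are enough: each worker only waits on an external process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda c: subprocess.run(c, check=False), commands))
    return [r.returncode for r in results]


# Hypothetical usage (file names are placeholders):
# count_all(["s1.bam", "s2.bam"], "genes.gtf", "te_index_file", total_ram_gb=128)
```

All runs share the same complete index, so this parallelises across samples only and does not split the TE annotation.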
OK~, thanks for your quick and detailed answer.
Hi, my indexing run has been going for 7 days and 18 hours without producing any output. Is that normal?
This is my command:
$TElocal_indexer --afile DN.denovo.RepeatMasker.Telocal.gtf --itype TE
Hi,
Unfortunately, depending on how big your TE GTF file is, the indexing step takes quite a while. I'm afraid it could take more days still.
We are currently developing a version of TElocal that could bypass/speed up this step.
Thanks.
Is it expected that no output is produced during the process?
Unfortunately, yes, because everything is being processed in memory. We had considered improving the logging, but this was technically a script that we were using in-house, and with our efforts going towards removing the index requirement, we decided not to add more to the indexing script.
Thanks.
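Since the indexer writes nothing while it works, one way to confirm that a long run is still alive is to inspect the process from outside. The sketch below is a generic operating-system check, not a TElocal feature. It assumes a POSIX system, reads the memory figure from /proc (Linux only, otherwise it returns None), and expects you to supply the PID yourself, for example from `ps` or `top`.

```python
import os
from pathlib import Path


def is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def resident_memory_gb(pid):
    """Resident memory in GB read from /proc, or None if it is unavailable."""
    status = Path(f"/proc/{pid}/status")
    if not status.exists():
        return None
    for line in status.read_text().splitlines():
        if line.startswith("VmRSS:"):
            kilobytes = int(line.split()[1])
            return kilobytes / 1024 ** 2
    return None


# Hypothetical usage, with 12345 standing in for the indexer's PID:
# print(is_running(12345), resident_memory_gb(12345))
```

A process that is still present and holding a large amount of memory is consistent with the in-memory processing described above, though it does not show how far along the run is.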
Thanks for your continuous efforts! Looking forward to using the improved version!
Hi, as the title says, is there a way to speed up TElocal?
I am still building the index for the TE annotation, so I don't have a clear idea of the running time yet. However, my genome is big and the RNA-seq BAM is about 40 GB, so the run will certainly be long.
Could you give me some tips to prepare for the counting step?