nerettilab / RepEnrich2

RepEnrich2 is an updated method to estimate repetitive element enrichment using high-throughput sequencing data.

Run time #3

Closed rtmag closed 6 years ago

rtmag commented 6 years ago

Hello!

Thanks for the update and for migrating the software to Bowtie2, it gives me a much better mapping efficiency.

However, the RepEnrich2.py quantification, after alignment and subset, is taking way too long to finish: about 2 weeks per sample when running with 25 CPUs.

My samples have about 75,788,322 uniquely mapped paired-end reads and 12,126,132 multimappers.

Is this normal?

Best, R

nskvir commented 6 years ago

Hi there, apologies for the delay in replying; I sometimes don't receive notifications from GitHub when issues are opened. This behavior is not typical of what we usually see. I asked a lab member for some samples she has run so that I could look at relative file sizes (our samples usually take between half a day and a little over a day to run).

I notice that run time increases with larger file size and a greater number of mapped reads, and it seems to increase more when the proportion of uniquely mapping reads is higher, which I notice is the case in your samples. However, I'm unsure why the increase is so drastic for you, as we have run samples with slightly more total mapped reads and have never had the run time approach that.
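To make the "proportion of uniquely mapping reads" concrete, here is a minimal sketch of how it could be computed from the two read counts quoted in the original post. The function name `unique_fraction` is made up for illustration and is not part of RepEnrich2.

```python
def unique_fraction(unique_reads, multi_reads):
    """Return the share of mapped reads that map uniquely.

    Hypothetical helper for illustration only; not a RepEnrich2 function.
    """
    total = unique_reads + multi_reads
    if total == 0:
        raise ValueError("no mapped reads")
    return float(unique_reads) / total


# Counts as reported in the original post above.
fraction = unique_fraction(75788322, 12126132)
print("uniquely mapped: {:.1%}".format(fraction))
```

Comparing this figure across samples with different run times would be one way to check whether the trend described above holds.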

[Attached image: repenrich2_runtimes]