mourisl / Rcorrector

Error correction for Illumina RNA-seq reads
GNU General Public License v3.0

Rcorrector speed optimization #18

Open sagnikbanerjee15 opened 4 years ago

sagnikbanerjee15 commented 4 years ago

Hello,

I have been running Rcorrector and it has given me very good results. Unfortunately, it takes a very long time. Are there any settings that can help increase the speed?

Thank you.

mourisl commented 4 years ago

That depends on the quality of the data. If there are more errors, Rcorrector can spend more time searching for the correct bases. You can try a lower threshold for -maxcorK, so that Rcorrector will not try to fix reads with too many errors.

Another trick is to decompress the fastq file and give Rcorrector more threads. In my experience, once more threads are available, the bottleneck becomes decompressing/compressing the fastq file.
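
The two suggestions above can be sketched roughly as follows. This is only an illustration, not part of Rcorrector: the helper names `decompress_fastq` and `build_rcorrector_cmd` are hypothetical, the `run_rcorrector.pl` script name and the `-s` (single-end reads) and `-t` (threads) options are taken from Rcorrector's usage message rather than from this thread, and the values 8 and 2 are arbitrary. The command is only assembled, not executed.

```python
from __future__ import annotations

import gzip
import shutil
from pathlib import Path


def decompress_fastq(gz_path: str) -> str:
    """Write an uncompressed copy of a .gz FASTQ file next to it; return its path."""
    src = Path(gz_path)
    dst = src.with_suffix("")  # reads.fastq.gz -> reads.fastq
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return str(dst)


def build_rcorrector_cmd(reads: str, threads: int, maxcor_k: int) -> list[str]:
    """Assemble (but do not run) a single-end Rcorrector command line.

    Assumed options: -s (single-end reads), -t (threads), -maxcorK.
    """
    return [
        "perl", "run_rcorrector.pl",
        "-s", reads,
        "-t", str(threads),
        "-maxcorK", str(maxcor_k),
    ]


if __name__ == "__main__":
    # Hypothetical input file name; adjust to your own data.
    plain = decompress_fastq("reads.fastq.gz")
    print(" ".join(build_rcorrector_cmd(plain, threads=8, maxcor_k=2)))
```

Decompressing once up front trades disk space for less time spent in gzip during the run, which matters most when many threads are waiting on input.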

sagnikbanerjee15 commented 4 years ago

Great, thanks!

IdoBar commented 4 years ago

Hi, coming back to this issue: is it possible to overcome the bottleneck by instructing Jellyfish to open multiple files at once, as detailed in its documentation ("How to read multiple files at once")?
I can put together a PR for this when I have a bit of time to play with it.

Cheers, Ido
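
For reference, a minimal sketch of what the Jellyfish approach looks like when Jellyfish is run on its own. It assumes the `-g` (file of generator commands) and `-G` (number of generators run simultaneously) options of `jellyfish count` described in that documentation section. How Rcorrector invokes Jellyfish internally is not shown in this thread, so this is not a patch for Rcorrector. The helper names, file names, k-mer length, and hash size below are hypothetical placeholders, and the command is only assembled, not executed.

```python
from __future__ import annotations

import shlex
from pathlib import Path


def write_generators(fastq_gz_files: list[str], generators_path: str) -> list[str]:
    """Write one 'gunzip -c <file>' command per input file; return the lines."""
    lines = [f"gunzip -c {shlex.quote(f)}" for f in fastq_gz_files]
    Path(generators_path).write_text("\n".join(lines) + "\n")
    return lines


def build_jellyfish_count_cmd(generators_path: str, n_generators: int,
                              kmer_len: int, hash_size: str,
                              threads: int, output: str) -> list[str]:
    """Assemble (but do not run) a 'jellyfish count' command using generators."""
    return [
        "jellyfish", "count",
        "-g", generators_path,    # file listing the generator commands
        "-G", str(n_generators),  # how many generators run at once
        "-m", str(kmer_len),
        "-s", hash_size,
        "-t", str(threads),
        "-o", output,
    ]


if __name__ == "__main__":
    # Placeholder file names and parameters, for illustration only.
    write_generators(["a.fastq.gz", "b.fastq.gz"], "generators")
    print(" ".join(build_jellyfish_count_cmd("generators", 2, 23, "100M", 8, "mer_counts.jf")))
```

Each generator is a separate decompression process, so several gzip streams can feed Jellyfish at the same time instead of one.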