Closed by sanjitsbatra 6 years ago
Hey! I have a dataset with about 1T of reads in fastq format. The first step alone takes about a week to load them into memory. Is there any way to speed this up?

Is there any way to parallelize this process? It seems like one could seek to blocks in parallel and load them into memory, right?

We are working on a faster step1 version, but 1T of reads is probably going to kill other parts of the software anyway. Can you comment on the genome and coverage? Either this is a super challenging project that I would love to hear about, or you can probably just downsample a lot...
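For what it's worth, the block-seek idea mentioned above can be sketched roughly like this. This is a minimal standalone Python sketch, not part of this tool: it assumes plain uncompressed 4-line FASTQ (gzipped input cannot be seeked this way; you would need a block-indexed compression like BGZF), and it uses the common heuristic that a record starts at a line beginning with `@` whose line two below begins with `+`.

```python
# Hypothetical sketch: count FASTQ records in parallel by splitting the file
# into byte ranges. Each worker seeks to its chunk start, realigns to the
# next record boundary, then parses 4-line records until it passes chunk end.
import os
import tempfile
from multiprocessing import Pool


def find_record_start(f, offset):
    """Advance from an arbitrary byte offset to the next FASTQ record start.

    Heuristic: a record starts at a line beginning with '@' whose line two
    below begins with '+'. Valid for well-formed 4-line FASTQ, since DNA
    sequence lines do not begin with '+'.
    """
    if offset == 0:
        f.seek(0)
    else:
        f.seek(offset - 1)
        f.readline()  # consume through the end of the current (partial) line
    while True:
        pos = f.tell()
        lines = [f.readline() for _ in range(3)]
        if not lines[0]:
            return pos  # hit EOF without finding another record
        if lines[0].startswith(b"@") and lines[2].startswith(b"+"):
            return pos
        f.seek(pos)
        f.readline()  # advance one line and retry


def count_records(args):
    """Count records whose start position falls before this chunk's end."""
    path, start, end = args
    with open(path, "rb") as f:
        begin = find_record_start(f, start)
        if begin >= end:
            return 0
        f.seek(begin)
        n = 0
        while f.tell() < end:
            header = f.readline()
            if not header:
                break
            for _ in range(3):  # sequence, '+', quality
                f.readline()
            n += 1
        return n


if __name__ == "__main__":
    # Build a small synthetic FASTQ file (fake data, for demonstration only).
    with tempfile.NamedTemporaryFile("wb", suffix=".fastq", delete=False) as tmp:
        for i in range(1000):
            tmp.write(b"@read%d\nACGTACGTAC\n+\nIIIIIIIIII\n" % i)
        path = tmp.name

    size = os.path.getsize(path)
    n_workers = 4
    bounds = [size * i // n_workers for i in range(n_workers + 1)]
    chunks = [(path, bounds[i], bounds[i + 1]) for i in range(n_workers)]

    with Pool(n_workers) as pool:
        counts = pool.map(count_records, chunks)
    print(sum(counts))  # total records across all chunks
    os.unlink(path)
```

In practice the I/O itself may not be the bottleneck on spinning disks (parallel seeks can make things worse there), so this mainly helps on SSDs or when parsing dominates; downsampling, as suggested above, is probably the bigger win for 1T of reads.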