databio / pypiper

Python toolkit for building restartable pipelines
http://pypiper.databio.org
BSD 2-Clause "Simplified" License

how to deal with huge fastq files #229

Open · zhangzhen opened 1 month ago

zhangzhen commented 1 month ago

Nextflow uses a scatter-gather approach to process huge FASTQ files: first, split one huge FASTQ file into multiple smaller FASTQ files; then submit a job to the batch system for each smaller file; finally, merge the per-file results into the sample-level result. What is the pypiperic way to do that?
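For concreteness, here is a minimal sketch of those three steps done serially inside a single pypiper pipeline. The input path, chunk size, and the `process_chunk` command are placeholders, and pypiper itself does not fan the per-chunk jobs out to a batch system; this only illustrates the scatter/process/gather structure being asked about.

```python
#!/usr/bin/env python
"""Illustrative scatter-gather sketch inside one pypiper pipeline.

`sample.fastq` and `process_chunk` are hypothetical stand-ins for the
real input and per-chunk command; chunk size is arbitrary.
"""
import glob
import os

import pypiper

pm = pypiper.PipelineManager(name="scatter_gather_demo", outfolder="results/")

# Scatter: split the big FASTQ into chunks of 1M reads (4M lines each).
# GNU split with -d writes numeric suffixes: chunk_00, chunk_01, ...
os.makedirs("results/chunks", exist_ok=True)
pm.run("split -l 4000000 -d sample.fastq results/chunks/chunk_",
       target="results/chunks/chunk_00")

# Process: run the per-chunk command for each piece. These run serially
# here; pypiper does not submit them as separate cluster jobs.
chunk_results = []
for chunk in sorted(glob.glob("results/chunks/chunk_[0-9]*")):
    out = chunk + ".out"
    pm.run("process_chunk {} > {}".format(chunk, out), target=out)  # hypothetical tool
    chunk_results.append(out)

# Gather: merge per-chunk outputs into the sample-level result.
merged = "results/sample.merged.out"
pm.run("cat " + " ".join(chunk_results) + " > " + merged, target=merged)

pm.stop_pipeline()
```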

vreuter commented 1 month ago

Hi @zhangzhen, pypiper wasn't really designed to do partitioning and parallelism; rather, it's meant to be applied to something that's already partitioned/chunked, either naturally (e.g., biological samples) or artificially (e.g., a FASTQ you've split arbitrarily). pepkit/looper would be how you'd normally do this sort of thing (submitting a single pypiper pipeline to multiple pieces of data). @donaldcampbelljr or @nsheff may have more recent information, though, as I haven't worked in depth on the project in a while.
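To make that concrete, the thing looper would submit once per chunk (or per sample) is just an ordinary pypiper pipeline script that takes a single input. A minimal sketch, where the `--input`/`--output-parent` arguments and the `process_chunk` command are hypothetical:

```python
#!/usr/bin/env python
"""Minimal per-chunk pypiper pipeline: the kind of script a looper/PEP
setup would submit once per FASTQ chunk. Argument names and the
`process_chunk` command are hypothetical."""
import argparse

import pypiper

parser = argparse.ArgumentParser(description="Process one FASTQ chunk")
parser.add_argument("--input", required=True, help="path to one FASTQ chunk")
parser.add_argument("--output-parent", required=True, help="parent output folder")
parser = pypiper.add_pypiper_args(parser)  # adds standard pypiper options (--recover, etc.)
args = parser.parse_args()

pm = pypiper.PipelineManager(name="chunk_pipeline",
                             outfolder=args.output_parent,
                             args=args)

result = args.input + ".out"
pm.run("process_chunk {} > {}".format(args.input, result), target=result)  # hypothetical tool

pm.stop_pipeline()
```

With this layout, each chunk would be listed as its own sample (or subsample) in a PEP sample table, and looper would handle submitting one job per row to the batch system; merging the per-chunk outputs back into a sample-level result would then be a separate gather step outside this script.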