Open golobor opened 6 years ago
it's actually a bug - @golobor, could you add a label so it gets attention?
@sergpolly remind me again - how is this a bug?.. fastqs are way too big to be distributed uncompressed, it's safe to assume that they are gzipped. As for the splitFastq - it does work, but we do have more control using custom chunking processes, e.g. we do not duplicate data the way splitFastq does, which is a major factor for big projects.
@golobor it was someone in the lab, or elsewhere (maybe even myself), who tried to feed uncompressed fastq-s into distiller - and the error or behaviour seemed rather cryptic at the time.
I understand that this is a ridiculous scenario, but nonetheless: at least if we keep this issue open, someone might find out about it, and we would not forget to mention it in the docs.
it happens here, where we feed the input ${fastq1/2} through zcat without checking whether it's gzipped or not... we could check the suffix somehow, or simply throw an error if we don't want to deal with anything other than gzipped fastqs.
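For illustration, here is a minimal sketch of what such a check could look like. It is not what distiller does: it is written in plain Python rather than in the pipeline's own scripts, it looks at the two-byte gzip magic number instead of the file suffix, and the function names (is_gzipped, open_fastq, require_gzipped) are hypothetical.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def open_fastq(path):
    """Option 1: accept both kinds of input.

    Opens the file for binary reading and decompresses only when the
    content is actually gzipped (hypothetical helper, not distiller code).
    """
    if is_gzipped(path):
        return gzip.open(path, "rb")
    return open(path, "rb")


def require_gzipped(path):
    """Option 2: refuse uncompressed input with a readable error."""
    if not is_gzipped(path):
        raise ValueError(
            f"{path} does not look gzipped; please compress it with gzip first"
        )
```

Checking the content rather than the suffix also catches files that are gzipped but named .fastq, or plain text files named .fastq.gz. Either option would replace the cryptic failure with predictable behaviour; which one fits better depends on whether we want to support uncompressed fastqs at all.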
Also todo: I'd like to see how nextflow's splitFastq (https://www.nextflow.io/docs/latest/operator.html#splitfastq) works - maybe it's timely to test it along with fixing this bug