Expected behavior and actual behavior
The current Nextflow docs for splitFastq state:
Finally the splitFastq operator is able to split paired-end read pair FASTQ files. It must be applied to a channel which emits tuples containing at least two elements that are the files to be split.
while the description for the pe argument states:
When true splits paired-end read files, therefore items emitted by the source channel must be tuples in which at least two elements are the read-pair files to be split.
This implies that when splitFastq is used with pe: true, it is expected to split an arbitrary number of FASTQ files for each entry of the channel. However, as the output below shows, only the first two files are split. This wasn't a problem (yet) in 2019, but it has become one now that some single-cell sequencing platforms require 3 FASTQ files as input.
A fix, however, would require the operator to read at least one entry of the source channel to determine the indexes. I don't know enough Groovy/Java to tell whether this is possible at all. If it is not, then the documentation should simply be changed.
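To make the idea of determining the indexes from one entry concrete, here is a minimal sketch in Python. It is not Nextflow's implementation and says nothing about what is feasible inside SplitOp.groovy. The function name `fastq_indexes`, the suffix list, and the tuple layout (a sample ID followed by the files) are all assumptions made for illustration.

```python
from typing import List, Sequence

# Assumed set of suffixes that mark an element as a FASTQ file (hypothetical).
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def fastq_indexes(entry: Sequence[object]) -> List[int]:
    """Return the positions of the elements in `entry` that look like FASTQ paths."""
    return [
        i for i, item in enumerate(entry)
        if str(item).lower().endswith(FASTQ_SUFFIXES)
    ]


# Assumed shape of one channel entry: a sample ID followed by three files.
first_entry = (
    "test_S1_L001",
    "test_S1_L001_I1_001.fastq.gz",
    "test_S1_L001_R1_001.fastq.gz",
    "test_S1_L001_R2_001.fastq.gz",
)

print(fastq_indexes(first_entry))  # [1, 2, 3]
```

With indexes derived this way, all three files would be candidates for splitting instead of only the first two.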
Agreed that it would be nice to be able to generate a tuple of an arbitrary number of files if possible (just taking the number of elements in the curly braces).
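The suggestion of counting the elements in the brace group can be sketched as follows. This is an illustration only: the helper name `brace_group_size` and the glob pattern are hypothetical (the pattern is modelled on the attached file names), and nested braces or multiple brace groups are not handled.

```python
import re


def brace_group_size(pattern: str) -> int:
    """Count the comma-separated alternatives in the first {...} group of a glob."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return 1  # no brace group: the pattern names a single file
    return len(match.group(1).split(","))


# Hypothetical glob matching the three attached test files.
print(brace_group_size("test_S1_L001_{I1,R1,R2}_001.fastq.gz"))  # 3
```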
Bug report
Steps to reproduce the problem
Program output
Environment
Additional context
I currently think the issue is in a code block in SplitOp.groovy, currently in lines 92-96, which hard-codes the indices.
Attachments: test_S1_L001_I1_001.fastq.gz, test_S1_L001_R1_001.fastq.gz, test_S1_L001_R2_001.fastq.gz, .nextflow.log
(EDIT: Updated with possible cause.)