nf-core / demultiplex

Demultiplexing pipeline for sequencing data
https://nf-co.re/demultiplex
MIT License

Reads dropped when demultiplexing a sample across lanes. #182

Closed AaronNHart closed 1 month ago

AaronNHart commented 5 months ago

Description of the bug

First off, thank you for maintaining this pipeline, it looks very useful!

While informally validating the pipeline on an old study, I believe I observed that FASTQ files for the same sample, sequenced on different lanes, overwrite one another because every lane's output is written to the same location.

My scenario is similar to the usage documentation with a samplesheet like:

id,samplesheet,lane,flowcell
foo,s3://SampleSheet.csv,1,s3://foo/
foo,s3://SampleSheet.csv,2,s3://foo/
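
To make the suspected collision concrete, here is a minimal Python sketch. It assumes, purely for illustration, that outputs are published under a path derived from the samplesheet row. The two layouts (`by_id` and `by_id_and_lane`), the `out/` prefix, and the `find_collisions` helper are hypothetical and are not taken from the pipeline's code. The point is only that a path built from `id` alone is claimed by both lanes, while a path that also includes `lane` is not.

```python
import csv
import io
from collections import defaultdict

# The samplesheet from this report, one row per lane of the same sample id.
SAMPLESHEET = """\
id,samplesheet,lane,flowcell
foo,s3://SampleSheet.csv,1,s3://foo/
foo,s3://SampleSheet.csv,2,s3://foo/
"""


def find_collisions(samplesheet_text, path_for_row):
    """Return {output_path: [lanes]} for every path claimed by more than one row."""
    claimed = defaultdict(list)
    for row in csv.DictReader(io.StringIO(samplesheet_text)):
        claimed[path_for_row(row)].append(row["lane"])
    return {path: lanes for path, lanes in claimed.items() if len(lanes) > 1}


def by_id(row):
    # Hypothetical layout: output directory depends on the sample id only.
    return f"out/{row['id']}/"


def by_id_and_lane(row):
    # Hypothetical layout: the lane is part of the output directory.
    return f"out/{row['id']}/L{int(row['lane']):03d}/"


print(find_collisions(SAMPLESHEET, by_id))           # {'out/foo/': ['1', '2']}
print(find_collisions(SAMPLESHEET, by_id_and_lane))  # {}
```

Running the same check against a real samplesheet would flag any `id` that appears on more than one lane, which is the situation described here.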

Command used and terminal output

nextflow run nf-core/demultiplex/dev/ \
   -config ./my.config \
   --input samplesheet.csv \
   --demultiplexer bclconvert \
   --outdir s3://bucket/out/ \
   -work-dir s3://bucket/work/

Relevant files

You can see here that the file shown is overwritten shortly afterwards. The timing corresponds to the moment when each bclconvert job completes.

image

I'd prefer not to post the whole log file directly, but if you have questions about it, I can pull out details as needed.
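
One way to confirm whether reads were actually lost, without sharing logs, is to compare read counts. The sketch below is an assumption-laden helper, not part of the pipeline: `count_fastq_reads` and `reads_missing` are hypothetical names, and it assumes well-formed FASTQ files with exactly four lines per record that are available locally (files on S3 would need to be downloaded first).

```python
import gzip


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming exactly 4 lines per record."""
    # Transparently handle gzip-compressed files based on the extension.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def reads_missing(per_lane_counts, published_count):
    """Reads expected from all lanes minus reads found in the published file."""
    return sum(per_lane_counts) - published_count
```

If the published FASTQ for a sample contains only as many reads as a single lane produced, `reads_missing` will be roughly the sum of the other lanes' counts, which would be consistent with one lane's output overwriting another's.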

System information

nschcolnicov commented 1 month ago

Hi @AaronNHart, this PR should have addressed the issue: https://github.com/nf-core/demultiplex/pull/225. I know it has been a long time since you reported this, but if you still need to run the pipeline, could you please retry using the latest version of dev?

AaronNHart commented 1 month ago

Thanks a lot for the follow-up. In the meantime I've changed roles and no longer have access to this infrastructure, but if I do pick up the pipeline again I'll be sure to check.