Closed TCLamnidis closed 8 months ago
In niche cases where multiple UDG treatments exist for a sample, and multiple libraries have each of these treatments, a file name collision kills the pipeline at the additional_library_merge step.
That's what confuses me... shouldn't they have been merged at the first post-dedup merging step? :thinking:
They are, and that's the problem: they end up with the same name. It could be an issue with the naming in the initial library merge step, or in the trimming step.
To give a better overview, say we have a sample with 4 libraries with the following attributes:

| Sample | Library | UDG_Treatment | Strandedness | Lane |
|---|---|---|---|---|
| ABC001 | A0101 | half | double | 1 |
| ABC001 | A0102 | half | double | 1 |
| ABC001 | B0101 | none | double | 1 |
| ABC001 | B0102 | none | double | 1 |

The BAMs of the first two libraries will be merged at the initial lib_merge and be named `ABC001_udghalf_libmerged.bam`.
Equally, the BAMs of the last two libraries will be merged at the initial lib_merge and be named `ABC001_udgnone_libmerged.bam`.
However, once they undergo BAM trimming, the outputs lose their UDG attribute, and both become `ABC001_libmerged.bam`.
Once the two come together for the `additional_library_merge` step, the two input files share a name and the file collision pops up.
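The naming sequence described above can be reproduced with a small toy model. This is only a sketch of the behaviour as reported, not the pipeline's actual code: the helper names `libmerge_name` and `trimmed_name` are hypothetical, and the file name patterns are taken from the example in this thread.

```python
from collections import Counter

# Library sheet mirroring the example table in this report.
libraries = [
    {"sample": "ABC001", "library": "A0101", "udg": "half"},
    {"sample": "ABC001", "library": "A0102", "udg": "half"},
    {"sample": "ABC001", "library": "B0101", "udg": "none"},
    {"sample": "ABC001", "library": "B0102", "udg": "none"},
]


def libmerge_name(sample, udg):
    """Hypothetical helper: name after the initial library merge (UDG kept)."""
    return f"{sample}_udg{udg}_libmerged.bam"


def trimmed_name(sample):
    """Hypothetical helper: name after trimming as reported (UDG dropped)."""
    return f"{sample}_libmerged.bam"


# The initial merge yields one BAM per (sample, UDG treatment) group.
groups = sorted({(lib["sample"], lib["udg"]) for lib in libraries})

merged = [libmerge_name(sample, udg) for sample, udg in groups]
trimmed = [trimmed_name(sample) for sample, _ in groups]

# Any name that occurs more than once would collide when staged together.
collisions = [name for name, count in Counter(trimmed).items() if count > 1]

print("after initial merge:", merged)
print("after trimming:     ", trimmed)
print("colliding names:    ", collisions)
```

In this model the two merged BAMs are distinct until the UDG attribute is dropped, at which point both inputs to the second merge carry the same name.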
Check Documentation
I have checked the following places for your error:
Description of the bug
In niche cases where multiple UDG treatments exist for a sample, and multiple libraries have each of these treatments, a file name collision kills the pipeline at the `additional_library_merge` step.
Steps to reproduce
Steps to reproduce the behaviour:
`nextflow run ...` (any input that requires merging of two already merged library-level BAMs during the `additional_library_merge` step)
Expected behaviour
The BAMs produced by the initial library merge should keep unique names through the following steps, to avoid such errors.
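One possible reading of this expectation, sketched under assumptions: if the trimmed output kept the UDG treatment in its name, the inputs to the second merge would stay distinct. The naming pattern and the helpers `trimmed_name` and `find_duplicates` below are hypothetical illustrations, not the pipeline's implementation or the fix that was eventually applied.

```python
from collections import Counter


def trimmed_name(sample, udg):
    """Hypothetical naming scheme: carry the UDG treatment through trimming."""
    return f"{sample}_udg{udg}_libmerged.bam"


def find_duplicates(names):
    """Return the sorted list of names that occur more than once."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Inputs to the second merge for the example sample in this report.
inputs = [trimmed_name("ABC001", "half"), trimmed_name("ABC001", "none")]

duplicates = find_duplicates(inputs)
if duplicates:
    # Fail early with a readable message instead of a staging collision.
    raise ValueError(f"Input file names are not unique: {duplicates}")

print("unique inputs:", inputs)
```

A check like `find_duplicates` could also serve as a guard before any step that stages several files into one working directory.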
Log files
Have you provided the following extra information/files:
`.nextflow.log` file
System
Nextflow Installation
Container engine
Additional context