nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License

Samplesheets with multiple normal samples per patient pass nf-validation #1293

Open asp8200 opened 12 months ago

asp8200 commented 12 months ago

Description of the bug

Samplesheets with multiple normal samples for the same patient pass nf-validation, but they (probably) shouldn't.

https://nfcore.slack.com/archives/CGFUX04HZ/p1698056996871649?thread_ts=1697539431.495609&cid=CGFUX04HZ

The issue can be reproduced by using this csv:

$ cat my_recalibrated_somatic_joint.csv
patient,sex,status,sample,cram,crai
test,XX,0,sample1,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram.crai
test,XX,0,sample2,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram.crai
test,XX,1,sample3,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test3.paired_end.recalibrated.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test3.paired_end.recalibrated.sorted.cram.crai

which is just tests/csv/3.0/recalibrated_somatic_joint.csv where the status of sample2 has been changed from 1 (tumor) to 0 (normal), so that patient test now has two normal samples.

nextflow run main.nf -profile test_cache,tools_somatic,docker --tools mutect2 --outdir results --input my_recalibrated_somatic_joint.csv

@FriederikeHanssen knows what this is about ;-)

Command used and terminal output

nextflow run main.nf -profile test_cache,tools_somatic,docker --tools mutect2 --outdir results --input my_recalibrated_somatic_joint.csv

....

Detected join operation duplicate emission on right channel -- offending element: key=test; value=[patient:test, sample:sample2, sex:XX, status:0, id:sample2, data_type:cram],/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram,/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram.crai
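Until validation catches this, a standalone pre-flight check on the samplesheet could flag the problem before the run reaches the join shown above. The following is only a minimal sketch in Python, not sarek or nf-validation code: the function name patients_with_multiple_normals is hypothetical, the file names in the example are shortened placeholders, and it assumes the patient, status and sample columns from the csv above, with status 0 meaning normal.

```python
import csv
import io
from collections import defaultdict


def patients_with_multiple_normals(csv_text):
    """Return {patient: [normal sample names]} for patients with more than one normal sample.

    Hypothetical helper, not part of sarek. Assumes the columns
    patient, status and sample, with status 0 = normal and 1 = tumor.
    """
    normals = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["status"].strip() == "0":
            patient = row["patient"].strip()
            sample = row["sample"].strip()
            # A sample may span several rows (e.g. one row per lane),
            # so count each sample name only once per patient.
            if sample not in normals[patient]:
                normals[patient].append(sample)
    return {p: s for p, s in normals.items() if len(s) > 1}


# Same layout as the samplesheet above, with placeholder file names.
example = """patient,sex,status,sample,cram,crai
test,XX,0,sample1,a.cram,a.cram.crai
test,XX,0,sample2,b.cram,b.cram.crai
test,XX,1,sample3,c.cram,c.cram.crai
"""

print(patients_with_multiple_normals(example))  # {'test': ['sample1', 'sample2']}
```

An empty result means no patient has more than one normal sample. Whether such samplesheets should be rejected outright or handled differently is still the open question of this issue.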

Relevant files

No response

System information

No response

onurcanbektas commented 5 months ago

I have the same problem.