Closed: bounlu closed this issue 3 months ago
Can you share the `.nextflow.log`?
Here you go:
I guess it would also be relevant for us to inspect the sample sheet. Can you also share that with us?
Yeah, I think the issue is unrelated to that error.
Can you try the same command and add `--use_annotation_cache_keys`?
Same error here using nextflow-23.04.4-all; `--use_annotation_cache_keys` makes no difference.
I am using normal samples (marked as 0 in the samplesheet) and I get:

```
The sample-sheet only contains tumor-samples, but the following tools, which were requested by the option "tools", expect at least one normal-sample : deepvariant, haplotypecaller
Process 'CNNSCOREVARIANTS' has been already used -- If you need to reuse the same component, include it with a different name or include it in a different workflow context
```
Can you share the log file please?
I am also getting the same error even though my sample sheet contains only normal samples (status 0), not tumor samples.
can you share the log file?
I also got the `Files within --vep_cache invalid` error and I fixed it by providing the VEP cache files for Ensembl 110. Strangely, this also fixed the `The sample-sheet only contains tumor-samples` error.
@bounlu Can you try without `snpeff,vep,merge` to see if I can rule out my hypothesis?
> I also got the `Files within --vep_cache invalid` error and I fixed it by providing the VEP cache files for Ensembl 110. Strangely, this also fixed the `The sample-sheet only contains tumor-samples` error.

Yeah, I think the two are fully unrelated, but no idea why the `The sample-sheet only contains tumor-samples` error gets triggered.
Without `snpeff,vep,merge` I also don't get the error.
Can you share the log file please?
This is my "launcher" (the original was missing the spaces before the line continuations, which breaks argument parsing):

```bash
export JAVA_HOME="/home/acpicornell/.sdkman/candidates/java/17.0.6-amzn"
export NXF_VER=23.04.4
/mnt/storage/$(whoami)/bin/nextflow run /mnt/storage/$(whoami)/pipelines/nf-core-sarek_3.3.1/3_3_1/ \
    --max_memory 100.0GB \
    --max_cpus 32 \
    --input /mnt/storage/$(whoami)/nfcore/sarek/230908/samplesheet.csv \
    --outdir /mnt/storage/$(whoami)/nfcore/sarek/230908/output \
    --step mapping \
    --genome GATK.GRCh38 \
    --igenomes_base /mnt/storage/$(whoami)/references \
    --tools deepvariant,freebayes,haplotypecaller,mpileup,strelka,sentieon_haplotyper \
    --aligner bwa-mem2 \
    --seq_platform ILLUMINA \
    --wes false \
    -profile singularity
```
All my samples are germline and I am not annotating. This is what I get:

```
Process 'CNNSCOREVARIANTS' has been already used -- If you need to reuse the same component, include it with a different name or include it in a different workflow context
 -- Check script '/mnt/storage/acpicornell/pipelines/nf-core-sarek_3.3.1/3_3_1/./workflows/../subworkflows/local/bam_variant_calling_germline_all/../vcf_variant_filtering_gatk/main.nf' at line: 22 or see '.nextflow.log' file for more details
The sample-sheet only contains tumor-samples, but the following tools, which were requested by the option "tools", expect at least one normal-sample : deepvariant, haplotypecaller
```
OK, so https://github.com/nf-core/sarek/releases/tag/3.3.2 should fix these issues.
I still get similar errors with the new version:

```
The sample-sheet only contains normal-samples, but the following tools, which were requested with "--tools", expect at least one tumor-sample : ascat, controlfreec, mutect2, msisensorpro
Files within --vep_cache invalid. Make sure there is a directory named homo_sapiens/110_GRCh38 in s3://annotation-cache/vep_cache/.
https://nf-co.re/sarek/usage#how-to-customise-snpeff-and-vep-annotation
The sample-sheet only contains tumor-samples, but the following tools, which were requested by the option "tools", expect at least one normal-sample : ascat, msisensorpro
```
OK, so we get all error messages at once, but I think it all lies with this error:

```
Files within --vep_cache invalid. Make sure there is a directory named homo_sapiens/110_GRCh38 in s3://annotation-cache/vep_cache/.
https://nf-co.re/sarek/usage#how-to-customise-snpeff-and-vep-annotation
```
In your case @bounlu, I'd say that adding `--use_annotation_cache_keys` or using a local cache should solve your issue.
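To sketch what "a local cache" means here: the error message above expects a `homo_sapiens/110_GRCh38` directory directly under the path passed to `--vep_cache`. A minimal layout check, with placeholder paths (the actual cache contents would come from Ensembl's VEP cache downloads, not from `mkdir`):

```shell
# Placeholder layout only; real cache files must be downloaded from Ensembl.
CACHE_DIR="$(mktemp -d)/vep_cache"
mkdir -p "${CACHE_DIR}/homo_sapiens/110_GRCh38"

# The directory the pipeline's check looks for:
ls "${CACHE_DIR}/homo_sapiens"

# The pipeline would then be started with: --vep_cache "${CACHE_DIR}"
echo "${CACHE_DIR}"
```

The species/version layout here mirrors the one named in the error message; other genomes or cache versions would use a different leaf directory.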
Hey! I also got the error

```
Files within --vep_cache invalid. Make sure there is a directory named homo_sapiens/110_GRCh38 in s3://annotation-cache/vep_cache/.
https://nf-co.re/sarek/usage#how-to-customise-snpeff-and-vep-annotation
The sample-sheet only contains tumor-samples, but the following tools, which were requested by the option "tools", expect at least one normal-sample : deepvariant, haplotypecaller
```

when starting

```bash
nextflow run nf-core/sarek -r 3.3.2 --outdir results_sarek_3.3.2 --input parameter.csv --genome GATK.GRCh38 --tools cnvkit,deepvariant,freebayes,haplotypecaller,manta,mpileup,snpeff,vep,strelka,tiddit,merge
```

In my understanding the error is linked to the check `if (params.vep_cache == "s3://annotation-cache/vep_cache") {`, which evaluates to false in my test case, as does `if (params.snpeff_cache == "s3://annotation-cache/snpeff_cache") {`.

If I explicitly define the parameters (before `--tools`) when starting the pipeline, both comparisons actually evaluate to true and no confusing error is raised (the value should be the same as the default, but maybe it is a different type; a Groovy expert might know):

```bash
nextflow run nf-core/sarek -r 3.3.2 --snpeff_cache s3://annotation-cache/snpeff_cache --vep_cache s3://annotation-cache/vep_cache --outdir results_sarek_3.3.2 --input parameter.csv --genome GATK.GRCh38 --tools cnvkit,deepvariant,freebayes,haplotypecaller,manta,mpileup,snpeff,vep,strelka,tiddit,merge
```
- nf-core/sarek v3.3.2-gf034b73
- containerEngine: singularity
- Nextflow version 23.04.2.5871
- CentOS Linux release 7.9.2009
Hello, it seems to be a conflict with the `--tools` setting. I had the following (same) errors with nf-core/sarek v3.3.2-gf034b73:

```
The sample-sheet only contains tumor-samples, but the following tools, which were requested by the option "tools", expect at least one normal-sample : deepvariant, haplotypecaller
Process 'CNNSCOREVARIANTS' has been already used -- If you need to reuse the same component, include it with a different name or include it in a different workflow context
```

and it worked fine for me when I set:

`--tools deepvariant,freebayes,sentieon_haplotyper,strelka,manta,tiddit,cnvkit`

or

`--tools deepvariant,haplotypecaller,freebayes,strelka,manta,tiddit,cnvkit`

instead of:

`--tools deepvariant,freebayes,haplotypecaller,sentieon_haplotyper,strelka,manta,tiddit,cnvkit`
@achakroun can you send everything you used? Samplesheet, full command, the .nextflow.log, any custom configuration files?
Sure
1- Command:

```bash
nextflow run nf-core/sarek -profile singularity -resume --max_cpus 10 --max_memory 40.GB --input ./samplesheet.csv --trim_fastq --genome GATK.GRCh38 --save_reference --outdir ./results --wes --intervals Twist_Exome_RefSeq_targets_hg38_200-pad.bed --tools deepvariant,haplotypecaller,freebayes,strelka,manta,tiddit,cnvkit --concatenate_vcfs
```

2- Samplesheet was like:

```csv
patient,status,sample,lane,fastq_1,fastq_2
841-23,0,841-23,Lane2,fastq/841-23_1.fastq.gz,fastq/841-23_2.fastq.gz
842-23,0,842-23,Lane2,fastq/842-23_1.fastq.gz,fastq/842-23_2.fastq.gz
```
Nothing else.
Can you send the `.nextflow.log` file as well?
Please note that the pipeline is still running: .nextflow.log
Sorry can you provide the log from the failed run?
Ah, pretty sure I know what is wrong. Opened a separate issue to fix it: https://github.com/nf-core/sarek/issues/1314
> If I explicitly define the parameters (`--snpeff_cache s3://annotation-cache/snpeff_cache --vep_cache s3://annotation-cache/vep_cache`) when starting the pipeline, both comparisons actually evaluate to true and no confusing error is raised.

This fixed the error for me.
Hey! Yeah, those are two completely unrelated issues; for some reason nf-validation bubbles up `The sample-sheet only contains tumor-samples, but the following tools, which were requested by the option "tools", expect at least one normal-sample : deepvariant, haplotypecaller` in both cases. But the real hint is in the other half of the error message.
@achakroun your issue is fixed on dev
Hey, this problem has not been entirely fixed. I am running with a local genome with `--igenomes_ignore` set, and my sample sheet contains both statuses (0 and 1). However, this error still occurs, making the workflow un-runnable with nf-core/sarek v3.4.2:

```
nextflow.exception.WorkflowScriptErrorException: The sample-sheet only contains tumor-samples, but the following tools, which were requested by the option "tools", expect at least one normal-sample : haplotypecaller
```
COMMAND:

```bash
nextflow \
    main.nf \
    -c "pfr_profile.config" \
    -profile pfr,singularity \
    -params-file "pfr_params.json" \
    --input ${INPUT} \
    --outdir "${OUT_DIR}" \
    --genome null \
    --igenomes_ignore \
    --wes true \
    --skip_tools baserecalibrator \
    --tools "freebayes,haplotypecaller,mpileup" \
    --fasta ${FASTA} \
    --fasta_fai ${FAI} \
    --dict ${DICT} \
    --trim_fastq true \
    --save_trimmed true \
    --save_reference \
    --save_mapped true \
    --save_output_as_bam true \
    -resume
```

Note, the `pfr` profile only specifies SLURM as the executor.
pfr_params.json:

```json
{
    "genome": null,
    "igenomes_ignore": true,
    "save_reference": true,
    "split_fastq": 50000000,
    "trim_fastq": true,
    "save_trimmed": true,
    "aligner": "bwa-mem",
    "save_mapped": true,
    "save_output_as_bam": true
}
```
Here is the error in the log file:

```
Jun-14 23:58:10.815 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: slurm
Jun-14 23:58:10.815 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'slurm'
Jun-14 23:58:10.831 [main] DEBUG nextflow.Session - Config process names validation disabled as requested
Jun-14 23:58:10.833 [main] DEBUG nextflow.Session - Igniting dataflow network (184)
Jun-14 23:58:10.844 [Actor Thread 5] ERROR nextflow.extension.OperatorImpl - @unknown
java.lang.NullPointerException: Cannot get property 'baseName' on null object
        at org.codehaus.groovy.runtime.NullObject.getProperty(NullObject.java:60)
        at org.codehaus.groovy.runtime.InvokerHelper.getProperty(InvokerHelper.java:190)
        at org.codehaus.groovy.runtime.callsite.NullCallSite.getProperty(NullCallSite.java:46)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callGetProperty(AbstractCallSite.java:329)
        at Script_48a2d276$_runScript_closure1$_closure2$_closure6.doCall(Script_48a2d276:132)
        ... (reflection frames omitted)
        at nextflow.extension.MapOp$_apply_closure1.doCall(MapOp.groovy:56)
        ... (reflection and GPars frames omitted)
        at java.base/java.lang.Thread.run(Thread.java:833)
```
Here is the sample sheet:

```csv
patient,status,sample,lane,fastq_1,fastq_2
GC,1,GC_CAT13ANXX_TTAGGC,L001,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_A_GC_CAT13ANXX_TTAGGC_L001_R1.fastq.gz,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_A_GC_CAT13ANXX_TTAGGC_L001_R2.fastq.gz
GC,1,GC_CAT13ANXX_TTAGGC,L002,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_A_GC_CAT13ANXX_TTAGGC_L002_R1.fastq.gz,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_A_GC_CAT13ANXX_TTAGGC_L002_R2.fastq.gz
GC,1,GC_CAT13ANXX_TTAGGC,L003,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_A_GC_CAT13ANXX_TTAGGC_L003_R1.fastq.gz,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_A_GC_CAT13ANXX_TTAGGC_L003_R2.fastq.gz
SW,0,SW_CAT13ANXX_TGACCA,L001,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_B_SW_CAT13ANXX_TGACCA_L001_R1.fastq.gz,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_B_SW_CAT13ANXX_TGACCA_L001_R2.fastq.gz
SW,0,SW_CAT13ANXX_TGACCA,L002,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_B_SW_CAT13ANXX_TGACCA_L002_R1.fastq.gz,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_B_SW_CAT13ANXX_TGACCA_L002_R2.fastq.gz
SW,0,SW_CAT13ANXX_TGACCA,L003,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_B_SW_CAT13ANXX_TGACCA_L003_R1.fastq.gz,/WGS/AGRF_CAGRF14129_CAT13ANXX/DH_B_SW_CAT13ANXX_TGACCA_L003_R2.fastq.gz
```
@charlesdavid can you send the file for the samplesheet?
@charlesdavid Your sample sheet appears to have two patients: one with the id `GC` that is all tumor samples, and another named `SW` that is all normal. The tools that you're trying to run require samples with the same patient id, with at least one tumor and one normal.
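For reference, a minimal sketch of a paired samplesheet where tumor and normal share the same `patient` id (all file and sample names here are hypothetical):

```csv
patient,status,sample,lane,fastq_1,fastq_2
P1,0,P1_normal,L001,fastq/P1_normal_R1.fastq.gz,fastq/P1_normal_R2.fastq.gz
P1,1,P1_tumor,L001,fastq/P1_tumor_R1.fastq.gz,fastq/P1_tumor_R2.fastq.gz
```

With this layout, tools that need a tumor/normal pair can match the two samples through the shared `P1` patient id.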
@maxulysse It might be good to close this issue as the most recent comments are helping troubleshoot user issues. Leaving it in the open state makes it seem that there is still something that needs to be fixed with sarek. Perhaps it would be good to improve the clarity of that error message though as it still seems to cause confusion.
This issue has been fixed, @charlesdavid your issue is now #1567. Thanks @kenibrewer for the idea
This bug emerged again after the latest merge of the `dev` version into master (3.4.3). This time it says the sample-sheet only contains normal-samples:

```
The sample-sheet only contains normal-samples, but the following tools, which were requested with "--tools", expect at least one tumor-sample : controlfreec, mutect2
Missing process or function Channel.empty([[]])
 -- Check script '/home/omeran/.nextflow/assets/nf-core/sarek/main.nf' at line: 342 or see '.nextflow.log' file for more details
```
This was a resume from a previous successful run with the same command line. It worked then and now it fails again.
I am guessing it's a false flag from the error message (for some reason it shows up for unrelated issues). What is in the `.nextflow.log` and the samplesheet?
What should be the correct format of the full path to the caches? There is also a discrepancy between the documentation help text and the actual cache files located on the annotation-cache S3 bucket:

- https://nf-co.re/sarek/3.4.3/parameters/#vep_cache says: `${vep_species}/${vepgenome}${vep_cache_version}`
- S3 has: `s3://annotation-cache/vep_cache/111_GRCh38/homo_sapiens/111_GRCh38/`

I am using a local cache, and it seems I need to update the path format to make it work with 3.4.3.
I think I also encountered that problem. Fixed it locally with some symlinks.
@asp8200 Can you please share your directory tree for the cache dirs?
We already had `vep_cache/110_GRCh38/homo_sapiens/110_GRCh38` downloaded from previous (old) versions of Sarek, and I had to introduce a symlink from `vep_cache/homo_sapiens/110_GRCh38` to `vep_cache/110_GRCh38/homo_sapiens/110_GRCh38` for Sarek v3.4.2. (I haven't tried with Sarek v3.4.3 yet.) I suspect that @maxulysse changed the folder structure for the VEP cache (and snpEff cache), but perhaps he can comment on that?
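The symlink workaround described above can be sketched as follows. The paths are placeholders (a temporary directory stands in for wherever the old Sarek versions downloaded the cache):

```shell
# Simulate the old cache layout with placeholder directories.
BASE="$(mktemp -d)"
mkdir -p "${BASE}/vep_cache/110_GRCh38/homo_sapiens/110_GRCh38"

# Expose the same files under the other layout via a symlink:
mkdir -p "${BASE}/vep_cache/homo_sapiens"
ln -s "${BASE}/vep_cache/110_GRCh38/homo_sapiens/110_GRCh38" \
      "${BASE}/vep_cache/homo_sapiens/110_GRCh38"

# Both paths now resolve to the same cache files.
ls -ld "${BASE}/vep_cache/homo_sapiens/110_GRCh38"
```

This avoids duplicating the multi-gigabyte cache on disk while satisfying both directory-structure checks.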
Description of the bug
I am getting a new type of error when running the same samplesheet that used to work before. The sample sheet contains both normal and tumor samples, though:
Command used and terminal output
Relevant files
No response
System information
```
Nextflow version: 23.08.1-edge
Executor: local
Container engine: Docker
OS: Linux
Version of nf-core/sarek: master, v3.3.1
```