nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License

[BUG] "pipeline completed with errors" - but no error given #491

Closed. markdunning closed this issue 2 years ago

markdunning commented 2 years ago

Hi,

I am running sarek for the first time and it fails quite early, but I can't tell what the error is:

WARN: Found unexpected parameters:
* --step-mapping: true
- Ignore this warning: params.schema_ignore_params = "step-mapping" 

For input string: ""status""

 -- Check script '/home/md1mjdx/.nextflow/assets/nf-core/sarek/main.nf' at line: 4183 or see '.nextflow.log' file for more details
Core Nextflow options
  revision              : master
  runName               : confident_euler
  containerEngine       : singularity
  container             : nfcore/sarek:2.7.1
  launchDir             : /mnt/fastdata/md1mjdx/LM
  workDir               : /mnt/fastdata/md1mjdx/LM/work
  projectDir            : /home/md1mjdx/.nextflow/assets/nf-core/sarek
  userName              : md1mjdx
  profile               : singularity
  configFiles           : /home/md1mjdx/.nextflow/assets/nf-core/sarek/nextflow.config, /mnt/fastdata/md1mjdx/Luke_Mansfield/nextflow.config

Input/output options
  input                 : input_7884.tsv
  outdir                : results_7884

Main options
  tools                 : null
  skip_qc               : null

Trim/split FASTQ
  clip_r1               : 0
  clip_r2               : 0
  three_prime_clip_r1   : 0
  three_prime_clip_r2   : 0
  trim_nextseq          : 0

Preprocessing
  markdup_java_options  : "-Xms4000m -Xmx7g"

Variant Calling
  ascat_ploidy          : null
  ascat_purity          : null
  cf_contamination      : null
  cf_ploidy             : 2
  read_structure1       : null
  read_structure2       : null

Annotation
  annotate_tools        : null
  cadd_indels           : false
  cadd_indels_tbi       : false
  cadd_wg_snvs          : false
  cadd_wg_snvs_tbi      : false
  snpeff_cache          : null
  vep_cache             : null

Reference genome options
  genome                : GRCz10
  bwa                   : s3://ngi-igenomes/igenomes//Danio_rerio/Ensembl/GRCz10/Sequence/BWAIndex/genome.fa.{amb,ann,bwt,pac,sa}
  fasta                 : s3://ngi-igenomes/igenomes//Danio_rerio/Ensembl/GRCz10/Sequence/WholeGenomeFasta/genome.fa
  igenomes_base         : s3://ngi-igenomes/igenomes/
  genomes_base          : null

Generic options
  max_multiqc_email_size: 25 MB
  sequencing_center     : null

Max job request options
  single_cpu_mem        : 7 GB
  max_cpus              : 4
  max_memory            : 64.GB
  max_time              : 10d

------------------------------------------------------
 Only displaying parameters that differ from defaults.
------------------------------------------------------
Pipeline Release  : master
Run Name          : confident_euler
Max Resources     : 64.GB memory, 4 cpus, 10d time per job
Container         : singularity - nfcore/sarek:2.7.1
Output dir        : results_7884
Launch dir        : /mnt/fastdata/md1mjdx/LM
Working dir       : /mnt/fastdata/md1mjdx/LM/work
Script dir        : /home/md1mjdx/.nextflow/assets/nf-core/sarek
User              : md1mjdx
Input             : input_7884.tsv
Step              : mapping
Genome            : GRCz10
Nucleotides/s     : 1000
MarkDuplicates    : Options
Java options      : "-Xms4000m -Xmx7g"
GATK Spark        : No
Save BAMs mapped  : No
Skip MarkDuplicates: No
AWS iGenomes base : s3://ngi-igenomes/igenomes/
Save Reference    : No
BWA indexes       : s3://ngi-igenomes/igenomes//Danio_rerio/Ensembl/GRCz10/Sequence/BWAIndex/genome.fa.{amb,ann,bwt,pac,sa}
fasta reference   : s3://ngi-igenomes/igenomes//Danio_rerio/Ensembl/GRCz10/Sequence/WholeGenomeFasta/genome.fa
Publish dir mode  : copy
Config Profile    : singularity
Config Files      : /home/md1mjdx/.nextflow/assets/nf-core/sarek/nextflow.config, /mnt/fastdata/md1mjdx/LM/nextflow.config
----------------------------------------------------
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /mnt/fastdata/md1mjdx/LM/work/singularity -- Use env variable NXF_SINGULARITY_CACHEDIR to specify a different location
Pulling Singularity image docker://nfcore/sarek:2.7.1 [cache /mnt/fastdata/md1mjdx/LM/work/singularity/nfcore-sarek-2.7.1.img]
-[nf-core/sarek] Pipeline completed with errors-
WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.

I looked at .nextflow.log and saw the following:

Feb-25 19:40:04.140 [main] DEBUG nextflow.Session - Session await > all process finished
Feb-25 19:40:04.140 [main] DEBUG nextflow.Session - Session await > all barriers passed
Feb-25 19:40:05.386 [Actor Thread 12] DEBUG nextflow.sort.BigSort - Sort completed -- entries: 1; slices: 1; internal sort time: 0.001 s; external sort time: 0.251 s; total time: 0.252 s
Feb-25 19:40:05.389 [Actor Thread 12] DEBUG nextflow.file.FileCollector - >> temp file exists? false
Feb-25 19:40:05.389 [Actor Thread 12] DEBUG nextflow.file.FileCollector - Missed collect-file cache -- cause: java.nio.file.NoSuchFileException: /scratch/8740858.1.all.q/7da7133bd4cee5a77cf7581d5b124130.collect-file
Feb-25 19:40:05.433 [Actor Thread 12] DEBUG nextflow.file.FileCollector - Saved collect-files list to: /scratch/8740858.1.all.q/7da7133bd4cee5a77cf7581d5b124130.collect-file
Feb-25 19:40:05.496 [Actor Thread 12] DEBUG nextflow.file.FileCollector - Deleting file collector temp dir: /scratch/8740858.1.all.q/nxf-3812309998024783298
Feb-25 19:40:05.780 [main] INFO  nextflow.Nextflow - -[nf-core/sarek] Pipeline completed with errors-
Feb-25 19:40:05.798 [main] DEBUG nextflow.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=0; failedCount=0; ignoredCount=0; cachedCount=0; pendingCount=0; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=0ms; failedDuration=0ms; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=0; peakCpus=0; peakMemory=0; ]
Feb-25 19:40:05.803 [main] DEBUG nextflow.trace.TraceFileObserver - Flow completing -- flushing trace file
Feb-25 19:40:05.821 [main] DEBUG nextflow.trace.ReportObserver - Flow completing -- rendering html report
Feb-25 19:40:05.822 [main] DEBUG nextflow.trace.ReportObserver - Execution report summary data:
  []
Feb-25 19:40:08.252 [main] DEBUG nextflow.trace.TimelineObserver - Flow completing -- rendering html timeline
Feb-25 19:40:08.620 [main] WARN  nextflow.dag.GraphvizRenderer - To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.
Feb-25 19:40:08.622 [main] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Feb-25 19:40:08.622 [main] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Feb-25 19:40:08.622 [main] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Feb-25 19:40:08.622 [main] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Feb-25 19:40:08.622 [main] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Feb-25 19:40:08.622 [main] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Feb-25 19:40:08.622 [main] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Feb-25 19:40:08.622 [main] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Feb-25 19:40:08.626 [main] DEBUG nextflow.CacheDB - Closing CacheDB done
Feb-25 19:40:08.629 [main] DEBUG nextflow.util.SpuriousDeps - AWS S3 uploader shutdown
Feb-25 19:40:08.807 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye

Is it something to do with a temp directory? Do I need to create this?

This is the command I used to run the pipeline:

export SINGULARITY_CACHEDIR=${PWD}/singularity-cache
export SINGULARITY_TMPDIR=${PWD}/singularity-tmp
module load apps/java

/shared/bioinformatics_core1/Shared/software/nextflow/v20.11.0-edge/nextflow run nf-core/sarek \
                            -resume \
                            --input input_8574.tsv \
                            --step-mapping \
                            --genome GRCz10  \
                            --outdir results_8574 \
                            -profile singularity \
                            --max_memory 64.GB \
                            --max_cpus 4

Thanks in advance for any help

FriederikeHanssen commented 2 years ago

Hi @markdunning! Could you attach the full .nextflow.log?

markdunning commented 2 years ago

I think the error has something to do with the status column in the input.tsv file:

Feb-25 19:39:27.879 [main] WARN  nextflow.Nextflow - Found unexpected parameters:
* --step-mapping: true
Feb-25 19:39:27.880 [main] INFO  nextflow.Nextflow - - Ignore this warning: params.schema_ignore_params = "step-mapping" 
Feb-25 19:39:28.532 [Actor Thread 3] ERROR nextflow.extension.OperatorEx - @unknown
java.lang.NumberFormatException: For input string: ""status""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:569)
    at java.lang.Integer.valueOf(Integer.java:766)
    at org.codehaus.groovy.runtime.StringGroovyMethods.toInteger(StringGroovyMethods.java:3065)
    at org.codehaus.groovy.runtime.dgm$1275.invoke(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:247)
    at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:130)
    at Script_c3d27a41$_extractFastq_closure8.doCall(Script_c3d27a41:4183)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:263)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1026)
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:38)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:139)
    at nextflow.extension.MapOp$_apply_closure1.doCall(MapOp.groovy:57)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:263)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1026)
    at groovy.lang.Closure.call(Closure.java:412)
    at groovyx.gpars.dataflow.operator.DataflowOperatorActor.startTask(DataflowOperatorActor.java:120)
    at groovyx.gpars.dataflow.operator.DataflowOperatorActor.onMessage(DataflowOperatorActor.java:108)
    at groovyx.gpars.actor.impl.SDAClosure$1.call(SDAClosure.java:43)
    at groovyx.gpars.actor.AbstractLoopingActor.runEnhancedWithoutRepliesOnMessages(AbstractLoopingActor.java:293)
    at groovyx.gpars.actor.AbstractLoopingActor.access$400(AbstractLoopingActor.java:30)
    at groovyx.gpars.actor.AbstractLoopingActor$1.handleMessage(AbstractLoopingActor.java:93)
    at groovyx.gpars.util.AsyncMessagingCore.run(AsyncMessagingCore.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Feb-25 19:39:28.594 [main] DEBUG nextflow.file.FileHelper - Creating a file system instance for provider: S3FileSystemProvider
Feb-25 19:39:28.623 [Actor Thread 3] DEBUG nextflow.Session - Session aborted -- Cause: For input string: ""status""
Feb-25 19:39:28.682 [main] DEBUG nextflow.file.FileHelper - AWS S3 config details: {}
Feb-25 19:39:28.789 [Actor Thread 3] DEBUG nextflow.Session - The following nodes are still active:
  [operator] map
  [operator] map
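
For anyone hitting the same symptom: the tail of .nextflow.log only shows the shutdown, while the actual exception is logged earlier at ERROR level. Below is a minimal Python sketch for pulling those lines out of a long log. It assumes the line layout seen in the excerpt above (date, time, thread in brackets, then level); `find_failures` and `LOG_LINE` are illustrative names, not part of Nextflow.

```python
import re
from typing import List

# Matches log lines shaped like the ones in this thread, e.g.
# Feb-25 19:39:28.532 [Actor Thread 3] ERROR nextflow.extension.OperatorEx - @unknown
LOG_LINE = re.compile(
    r"^\w{3}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \[[^\]]+\] (?P<level>[A-Z]+)\s+(?P<rest>.*)$"
)


def find_failures(log_text: str) -> List[str]:
    """Return ERROR entries (with the line that follows) and 'Session aborted' entries."""
    hits = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = LOG_LINE.match(line)
        if not match:
            continue  # stack-trace lines and other continuation lines
        if match.group("level") == "ERROR":
            # In the log above, the exception message sits on the next line.
            follow = lines[i + 1].strip() if i + 1 < len(lines) else ""
            hits.append(f"{line.strip()} | {follow}")
        elif "Session aborted" in match.group("rest"):
            hits.append(line.strip())
    return hits
```

Calling `find_failures` on the contents of .nextflow.log would list the `NumberFormatException` entry and the `Session aborted -- Cause: ...` line without scrolling through the DEBUG output.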

Here are the first few columns of the input file. Does it want quote marks around the status column?


"subject"   "sex"   "status"    "sample"    "lane"
"A1_founder_S37"    "XX"    0   "A1"    1   
"A2_founder_S38"    "XX"    0   "A2" 1

FriederikeHanssen commented 2 years ago

Ah yes, remove the header line and the quotes :) Sarek expects a TSV without a header.
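
A minimal sketch of that cleanup in Python, assuming a tab-separated file whose third column is the status, as in the excerpt above. `clean_sample_sheet` is a hypothetical helper, not part of sarek; it drops a leading header row, strips the quotes, and refuses any later row whose status is not an integer.

```python
import csv
import io


def _is_int(value: str) -> bool:
    """True if the field parses as an integer, as the status column must."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def clean_sample_sheet(text: str, status_col: int = 2) -> str:
    """Return TSV text without quotes and without a leading header row.

    status_col is the zero-based index of the status column (an assumption
    based on the column order shown in this thread).
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar='"')
    cleaned = []
    seen_first_row = False
    for row in reader:
        fields = [field.strip() for field in row]
        # Drop empty trailing fields left behind by trailing tabs.
        while fields and fields[-1] == "":
            fields.pop()
        if not fields:
            continue  # blank line
        is_first = not seen_first_row
        seen_first_row = True
        if len(fields) <= status_col or not _is_int(fields[status_col]):
            if is_first:
                continue  # treat a non-numeric first row as the header
            raise ValueError(f"status is not an integer in row: {fields}")
        cleaned.append("\t".join(fields))
    return "".join(line + "\n" for line in cleaned)
```

To apply it to a file, read the text (for example with `pathlib.Path("input.tsv").read_text()`), pass it through `clean_sample_sheet`, and write the result to a new file; the file name here is only a placeholder.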

markdunning commented 2 years ago

Yes, that was the issue. Thanks a lot!