epi2me-labs / wf-human-variation

Missing output file(s) `SAMPLE.haplocheck.tsv` expected by process `haplocheck (1)` #188

Open nikelau opened 3 weeks ago

nikelau commented 3 weeks ago

Operating System

Windows 11

Other Linux

No response

Workflow Version

2.2.1

Workflow Execution

EPI2ME Desktop (Local)

Other workflow execution

No response

EPI2ME Version

EPI2ME V5.1.10

CLI command run

No response

Workflow Execution - CLI Execution Profile

None

What happened?

The run fails because the file `SAMPLE.haplocheck.tsv` is missing, apparently around the `publish_artifact` step. Any insights would be much appreciated. Thanks!

Relevant log output

This is epi2me-labs/wf-human-variation v2.2.1.
--------------------------------------------------------------------------------
Searching input for [.bam, .ubam] files.
[ed/1a0c8d] Submitted process > getParams
[bd/1ac47a] Submitted process > str:getVersions
[2c/13fac7] Submitted process > index_ref_fai (1)
[34/2714d3] Submitted process > report_snp:getVersions
[e2/b3b38b] Submitted process > cnv_spectre:getParams
[2e/ea53c2] Submitted process > cnv_spectre:getVersions
[b8/ef62e1] Submitted process > sv:runReport:getVersions
[e2/b5f309] Submitted process > report_snp:getParams
[5b/a73bfd] Submitted process > str:getParams
[ea/ff432a] Submitted process > sv:runReport:getParams
[22/4b2c5f] Submitted process > getVersions
[44/ccbf3f] Submitted process > cram_cache (1)
[02/7a163f] Submitted process > ingress:checkBamHeaders (1)
[71/dbe98e] Submitted process > cnv_spectre:add_snp_tools_to_versions
[0e/d7c53a] Submitted process > ingress:catSortBams (1)
[71/d7d392] Submitted process > getAllChromosomesBed (1)
[cf/fbb50a] Submitted process > ingress:check_for_alignment (1)
[ff/79a6a0] Submitted process > ingress:minimap2_alignment (1)
[b1/59b824] Submitted process > validate_modbam (1)
[a7/32f483] Submitted process > lookup_clair3_model (1)
[87/ee4640] Submitted process > readStats (1)
[b2/54ce3f] Submitted process > haplocheck (1)
[0b/ad716e] Submitted process > mosdepth_input (1)
[bd/fbd01b] Submitted process > getGenome (1)
[cd/a21486] Submitted process > cnv_spectre:mosdepth (1)
[a1/9e42df] Submitted process > configure_jbrowse (1)
Autoselected Clair3 model: r941_prom_sup_g5014
[47/061157] Submitted process > snp:make_chunks (1)
[6f/fc74cf] Submitted process > publish_artifact (1)
ERROR ~ Error executing process > 'haplocheck (1)'
Caused by:
  Missing output file(s) `SAMPLE.haplocheck.tsv` expected by process `haplocheck (1)`
Command executed:
  # Extract mito-genome reads first
  samtools view -@1 -hb SAMPLE.cram chrM MT Mt > SAMPLE
  samtools index -@2 SAMPLE

  # Get MT sequence IDs. This is needed because, providing inexisting
  # sequence IDs to `samtools faidx` causes to create new empty sequences
  # in the output FASTA file.
  samtools view SAMPLE | cut -f 3 | sort | uniq > seqs.txt
  has_mt=$( wc -l seqs.txt | awk '{print $1}' )

  # Run the commands if there are reads to process
  if [ $has_mt -gt 0 ]; then
      # Extract regions from the fasta file
      samtools faidx -r seqs.txt hg38.analysisSet.fa > mt.fa && samtools faidx mt.fa

      # Run mutserve
      java -jar `which mutserve.jar` call \
          --level 0.01 \
          --reference mt.fa \
          --mapQ 20 \
          --baseQ 20 \
          --output mt.vcf.gz \
          --no-ansi \
          --threads 2 \
          SAMPLE

      # Run haplocheck
      haplocheck --out SAMPLE.haplocheck.tsv mt.vcf.gz
  # If no MT is found, then save it as NV (no value) as opposed to ND (not determined)
  else
      echo "Sample  Contamination Status    Contamination Level Distance    Sample Coverage" > SAMPLE.haplocheck.tsv
      echo "SAMPLE  NO  NV  0   0" >> SAMPLE.haplocheck.tsv
  fi
Command exit status:
  0
Command output:

  mtDNA Variant Detection v2.0.0-rc12
  https://github.com/seppinho/mutserve
  (c) Sebastian Schoenherr, Hansi Weissensteiner, Lukas Forer
  [call, --level, 0.01, --reference, mt.fa, --mapQ, 20, --baseQ, 20, --output, mt.vcf.gz, --no-ansi, --threads, 2, SAMPLE]
  [Run]     SAMPLE...
  [Done]    SAMPLE. Execution Time: 00:00:00
  [Run]     Merge output files...
  [Done]    Merge output files. Execution Time: 00:00:00

  haplocheck 1.3.3
  https://github.com/genepi/haplocheck
  (c) 2020 Sebastian Schoenherr, Hansi Weissensteiner, Lukas Forer

  Check for Contamination.. 
  RUN Load file...
  ERROR Contamination failed
Command error:
  [main_samview] region "MT" specifies an invalid region or unknown reference. Continue anyway.
  [main_samview] region "Mt" specifies an invalid region or unknown reference. Continue anyway.
  Picked up JAVA_TOOL_OPTIONS: -Xlog:disable -Xlog:all=warning:stderr
  java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: A
    at htsjdk.variant.variantcontext.VariantContext.makeAlleles(VariantContext.java:1536)
    at htsjdk.variant.variantcontext.VariantContext.<init>(VariantContext.java:468)
    at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:647)
    at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:638)
    at genepi.mut.pileup.VcfWriter.createVCF(VcfWriter.java:213)
    at genepi.mut.commands.VariantCallingCommand.call(VariantCallingCommand.java:212)
    at genepi.mut.commands.VariantCallingCommand.call(VariantCallingCommand.java:28)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1933)
    at picocli.CommandLine.access$1100(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2332)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2326)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2291)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2159)
    at picocli.CommandLine.execute(CommandLine.java:2058)
    at genepi.mut.App.main(App.java:60)
  Picked up JAVA_TOOL_OPTIONS: -Xlog:disable -Xlog:all=warning:stderr
  htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: null, for input source: file://mt.vcf.gz
    at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:263)
    at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:102)
    at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:127)
    at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:121)
    at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:81)
    at htsjdk.variant.vcf.VCFFileReader.<init>(VCFFileReader.java:145)
    at htsjdk.variant.vcf.VCFFileReader.<init>(VCFFileReader.java:95)
    at importer.VcfImporter.load(VcfImporter.java:22)
    at genepi.haplocheck.steps.ContaminationStep.detectContamination(ContaminationStep.java:67)
    at genepi.haplocheck.steps.ContaminationStep.run(ContaminationStep.java:36)
    at genepi.haplocheck.commands.ContaminationCommand.call(ContaminationCommand.java:55)
    at genepi.haplocheck.commands.ContaminationCommand.call(ContaminationCommand.java:1)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1933)
    at picocli.CommandLine.access$1100(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2332)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2326)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2291)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2159)
    at picocli.CommandLine.execute(CommandLine.java:2058)
    at genepi.haplocheck.App.main(App.java:32)
  Caused by: java.io.EOFException
    at java.base/java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:279)
    at java.base/java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:269)
    at java.base/java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:175)
    at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:79)
    at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:91)
    at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:257)
    ... 19 more
Work dir:
  /mnt/d/epi2melabs/instances/wf-human-variation_01HZE6ZKVD0XJV6TKKEVSMGB6W/work/b2/54ce3f6b8ea801524b4571ae2e7b34
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
 -- Check '/mnt/d/epi2melabs/instances/wf-human-variation_01HZE6ZKVD0XJV6TKKEVSMGB6W/nextflow.log' file for details
WARN: Killing running tasks (6)

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

SamStudio8 commented 3 weeks ago

Hi @nikelau, we're aware of some issues with this process in v2.2.1. Please update the workflow to v2.2.2.

nikelau commented 3 weeks ago

Thanks Sam! Will try now!
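
A note for anyone reading the log above: the task reports exit status 0 even though mutserve threw an exception and haplocheck then failed to parse `mt.vcf.gz`. The `EOFException` raised while reading the gzip header suggests the VCF was empty or truncated, so Nextflow only noticed the failure when the declared output was absent. The thread does not say what changed in v2.2.2, so the following is only a minimal sketch of one possible guard, not the workflow's actual fix. The helper names `vcf_is_readable` and `write_placeholder` are hypothetical, and the placeholder is assumed to be tab-separated, which the log rendering does not preserve.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; they are not part of wf-human-variation.

# Succeed only if the file exists, is non-empty, and passes gzip's integrity test.
vcf_is_readable() {
    local vcf="$1"
    [ -s "$vcf" ] && gzip -t "$vcf" 2>/dev/null
}

# Write the two-line "NV" placeholder that the process script emits when no
# mitochondrial reads are found (tab separation is an assumption).
write_placeholder() {
    local sample="$1" out="$2"
    printf 'Sample\tContamination Status\tContamination Level\tDistance\tSample Coverage\n' > "$out"
    printf '%s\tNO\tNV\t0\t0\n' "$sample" >> "$out"
}

# Demo in a scratch directory: simulate the empty VCF left behind by a failed call.
workdir="$(mktemp -d)"
: > "$workdir/mt.vcf.gz"
out="$workdir/SAMPLE.haplocheck.tsv"

if vcf_is_readable "$workdir/mt.vcf.gz"; then
    # In the real process this is where
    # `haplocheck --out SAMPLE.haplocheck.tsv mt.vcf.gz` would run.
    echo "VCF looks usable"
else
    write_placeholder SAMPLE "$out"
fi

cat "$out"
```

Whether a placeholder is the right outcome here is a design choice: it keeps the run going, but it also hides a variant-calling failure, so failing the task with a non-zero exit code may be preferable.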