Missing output file(s) `SAMPLE.haplocheck.tsv` expected by process `haplocheck (1)` #188

Open nikelau opened 3 weeks ago

nikelau commented 3 weeks ago

Operating System

Windows 11

Other Linux

Workflow Version


Workflow Execution

EPI2ME Desktop (Local)

Other workflow execution

EPI2ME Version

EPI2ME V5.1.10

CLI command run

Workflow Execution - CLI Execution Profile


What happened?

The run fails because the output file "SAMPLE.haplocheck.tsv" is missing. The error is reported for the "haplocheck" process, right after "publish_artifact" is submitted. Any insights would be much appreciated. Thanks!

Relevant log output

This is epi2me-labs/wf-human-variation v2.2.1.
Searching input for [.bam, .ubam] files.
[ed/1a0c8d] Submitted process > getParams
[bd/1ac47a] Submitted process > str:getVersions
[2c/13fac7] Submitted process > index_ref_fai (1)
[34/2714d3] Submitted process > report_snp:getVersions
[e2/b3b38b] Submitted process > cnv_spectre:getParams
[2e/ea53c2] Submitted process > cnv_spectre:getVersions
[b8/ef62e1] Submitted process > sv:runReport:getVersions
[e2/b5f309] Submitted process > report_snp:getParams
[5b/a73bfd] Submitted process > str:getParams
[ea/ff432a] Submitted process > sv:runReport:getParams
[22/4b2c5f] Submitted process > getVersions
[44/ccbf3f] Submitted process > cram_cache (1)
[02/7a163f] Submitted process > ingress:checkBamHeaders (1)
[71/dbe98e] Submitted process > cnv_spectre:add_snp_tools_to_versions
[0e/d7c53a] Submitted process > ingress:catSortBams (1)
[71/d7d392] Submitted process > getAllChromosomesBed (1)
[cf/fbb50a] Submitted process > ingress:check_for_alignment (1)
[ff/79a6a0] Submitted process > ingress:minimap2_alignment (1)
[b1/59b824] Submitted process > validate_modbam (1)
[a7/32f483] Submitted process > lookup_clair3_model (1)
[87/ee4640] Submitted process > readStats (1)
[b2/54ce3f] Submitted process > haplocheck (1)
[0b/ad716e] Submitted process > mosdepth_input (1)
[bd/fbd01b] Submitted process > getGenome (1)
[cd/a21486] Submitted process > cnv_spectre:mosdepth (1)
[a1/9e42df] Submitted process > configure_jbrowse (1)
Autoselected Clair3 model: r941_prom_sup_g5014
[47/061157] Submitted process > snp:make_chunks (1)
[6f/fc74cf] Submitted process > publish_artifact (1)
ERROR ~ Error executing process > 'haplocheck (1)'
Caused by:
  Missing output file(s) `SAMPLE.haplocheck.tsv` expected by process `haplocheck (1)`
Command executed:
  # Extract mito-genome reads first
  samtools view -@1 -hb SAMPLE.cram chrM MT Mt > SAMPLE
  samtools index -@2 SAMPLE

  # Get MT sequence IDs. This is needed because, providing inexisting
  # sequence IDs to `samtools faidx` causes to create new empty sequences
  # in the output FASTA file.
  samtools view SAMPLE | cut -f 3 | sort | uniq > seqs.txt
  has_mt=$( wc -l seqs.txt | awk '{print $1}' )

  # Run the commands if there are reads to process
  if [ $has_mt -gt 0 ]; then
      # Extract regions from the fasta file
      samtools faidx -r seqs.txt hg38.analysisSet.fa > mt.fa && samtools faidx mt.fa

      # Run mutserve
      java -jar `which mutserve.jar` call                 --level 0.01                 --reference mt.fa                 --mapQ 20                 --baseQ 20                 --output mt.vcf.gz                 --no-ansi                 --threads 2                 SAMPLE

      # Run haplocheck
      haplocheck --out SAMPLE.haplocheck.tsv mt.vcf.gz
  # If no MT is found, then save it as NV (no value) as opposed to ND (not determined)
  else
      echo "Sample  Contamination Status    Contamination Level Distance    Sample Coverage" > SAMPLE.haplocheck.tsv
      echo "SAMPLE  NO  NV  0   0" >> SAMPLE.haplocheck.tsv
  fi
Command exit status:
Command output:

  mtDNA Variant Detection v2.0.0-rc12
  (c) Sebastian Schoenherr, Hansi Weissensteiner, Lukas Forer
  [call, --level, 0.01, --reference, mt.fa, --mapQ, 20, --baseQ, 20, --output, mt.vcf.gz, --no-ansi, --threads, 2, SAMPLE]
  [Run]     SAMPLE...
  [Done]    SAMPLE. Execution Time: 00:00:00
  [Run]     Merge output files...
  [Done]    Merge output files. Execution Time: 00:00:00

  haplocheck 1.3.3
  (c) 2020 Sebastian Schoenherr, Hansi Weissensteiner, Lukas Forer

  Check for Contamination.. 
  RUN Load file...
  ERROR Contamination failed
Command error:
  [main_samview] region "MT" specifies an invalid region or unknown reference. Continue anyway.
  [main_samview] region "Mt" specifies an invalid region or unknown reference. Continue anyway.
  Picked up JAVA_TOOL_OPTIONS: -Xlog:disable -Xlog:all=warning:stderr
  java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: A
    at htsjdk.variant.variantcontext.VariantContext.makeAlleles(VariantContext.java:1536)
    at htsjdk.variant.variantcontext.VariantContext.<init>(VariantContext.java:468)
    at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:647)
    at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:638)
    at genepi.mut.pileup.VcfWriter.createVCF(VcfWriter.java:213)
    at genepi.mut.commands.VariantCallingCommand.call(VariantCallingCommand.java:212)
    at genepi.mut.commands.VariantCallingCommand.call(VariantCallingCommand.java:28)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1933)
    at picocli.CommandLine.access$1100(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2332)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2326)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2291)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2159)
    at picocli.CommandLine.execute(CommandLine.java:2058)
    at genepi.mut.App.main(App.java:60)
  Picked up JAVA_TOOL_OPTIONS: -Xlog:disable -Xlog:all=warning:stderr
  htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: null, for input source: file://mt.vcf.gz
    at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:263)
    at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:102)
    at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:127)
    at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:121)
    at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:81)
    at htsjdk.variant.vcf.VCFFileReader.<init>(VCFFileReader.java:145)
    at htsjdk.variant.vcf.VCFFileReader.<init>(VCFFileReader.java:95)
    at importer.VcfImporter.load(VcfImporter.java:22)
    at genepi.haplocheck.steps.ContaminationStep.detectContamination(ContaminationStep.java:67)
    at genepi.haplocheck.steps.ContaminationStep.run(ContaminationStep.java:36)
    at genepi.haplocheck.commands.ContaminationCommand.call(ContaminationCommand.java:55)
    at genepi.haplocheck.commands.ContaminationCommand.call(ContaminationCommand.java:1)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1933)
    at picocli.CommandLine.access$1100(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2332)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2326)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2291)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2159)
    at picocli.CommandLine.execute(CommandLine.java:2058)
    at genepi.haplocheck.App.main(App.java:32)
  Caused by: java.io.EOFException
    at java.base/java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:279)
    at java.base/java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:269)
    at java.base/java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:175)
    at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:79)
    at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:91)
    at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:257)
    ... 19 more
Work dir:
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
 -- Check '/mnt/d/epi2melabs/instances/wf-human-variation_01HZE6ZKVD0XJV6TKKEVSMGB6W/nextflow.log' file for details
WARN: Killing running tasks (6)

Application activity log entry

Were you able to successfully run the latest version of the workflow with the demo data?


Other demo data information

SamStudio8 commented 3 weeks ago

Hi @nikelau, we're aware of some issues with this process in v2.2.1. Please update the workflow to v2.2.2.

nikelau commented 3 weeks ago

> Hi @nikelau, we're aware of some issues with this process in v2.2.1. Please update the workflow to v2.2.2.

Thanks Sam! Will try now!
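
For anyone who lands here with the same trace, the log above reads as a chain of failures: mutserve throws `Duplicate allele added to VariantContext` while writing the VCF, haplocheck then hits an `EOFException` while reading the gzip header of `mt.vcf.gz` (which suggests the file is empty or truncated), and so `SAMPLE.haplocheck.tsv` is never written. The snippet below is only a diagnostic sketch for confirming that in the process work dir. It is not the workflow's code and not the fix shipped in v2.2.2; the helper name `is_readable_gzip` and the demo file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of wf-human-variation): succeeds only if the
# file exists, is non-empty, and passes gzip's built-in integrity test.
# The explicit size check comes first so a zero-byte file is always rejected,
# regardless of how the local gzip treats empty input.
is_readable_gzip() {
    local f="$1"
    [ -s "$f" ] && gzip -t "$f" 2>/dev/null
}

# Demo on two throwaway files (illustrative names, created in a temp dir).
demo_dir=$(mktemp -d)
: > "$demo_dir/empty.vcf.gz"                                   # zero-byte file
printf '##fileformat=VCFv4.2\n' | gzip -c > "$demo_dir/ok.vcf.gz"  # valid gzip

for f in "$demo_dir/empty.vcf.gz" "$demo_dir/ok.vcf.gz"; do
    if is_readable_gzip "$f"; then
        echo "$(basename "$f"): readable"
    else
        echo "$(basename "$f"): empty or corrupt"
    fi
done
rm -rf "$demo_dir"
```

In the failing work dir, the same check would be `is_readable_gzip mt.vcf.gz`. If it fails, the problem is upstream in the mutserve call rather than in haplocheck itself.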