epi2me-labs / wf-bacterial-genomes

Small variant calling for haploid samples
https://labs.epi2me.io/

Error with Demo data #29

Open kiranpatil222 opened 8 months ago

kiranpatil222 commented 8 months ago

Operating System

Ubuntu 22.04

Other Linux

No response

Workflow Version

v1.2.0

Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

nextflow run epi2me-labs/wf-bacterial-genomes \
    --fastq wf-bacterial-genomes-demo/isolates_fastq \
    --isolates \
    --sample_sheet wf-bacterial-genomes-demo/isolates_sample_sheet.csv \
    -profile standard

Workflow Execution - CLI Execution Profile

standard (default)

What happened?

ERROR ~ Error executing process > 'calling_pipeline:deNovo (2)'

Caused by:
  Process `calling_pipeline:deNovo (2)` terminated with an error exit status (1)

Command executed:

  COV_FAIL=0
  FLYE_EXIT_CODE=0
  flye    --nano-hq reads.fastq.gz --out-dir output --threads "3" ||     FLYE_EXIT_CODE=$?

  if [[ $FLYE_EXIT_CODE -eq 0 ]]; then
      mv output/assembly.fasta "./test1.draft_assembly.fasta"
      mv output/assembly_info.txt "./test1_flye_stats.tsv"
      bgzip "test1.draft_assembly.fasta"
  else
      # flye failed --> check the log to check why
      edge_cov=$(
          grep -oP 'Mean edge coverage: \K\d+' output/flye.log             || echo 5
      )
      ovlp_cov=$(
          grep -oP 'Overlap-based coverage: \K\d+' output/flye.log             || echo 5
      )
      if [[
          $edge_cov -lt 5 ||
          $ovlp_cov -lt 5
      ]]; then
          echo -n "Caught Flye failure due to low coverage (either mean edge cov. or "
          echo "overlap-based cov. were below 5)".
          COV_FAIL=1
      elif grep -q "No disjointigs were assembled" output/flye.log; then
          echo -n "Caught Flye failure due to disjointig assembly."
          COV_FAIL=2
      else
          # exit a subshell with error so that the process fails
          ( exit $FLYE_EXIT_CODE )
      fi
  fi

Command exit status:
  1

Command output:
  (empty)

Command error:
  [2024-03-08 11:32:29] INFO: Extending reads
  [2024-03-08 11:33:26] INFO: Overlap-based coverage: 32
  [2024-03-08 11:33:26] INFO: Median overlap divergence: 0.0971861
  0% 80% 90% 100%
  [2024-03-08 11:34:32] INFO: Assembled 2 disjointigs
  [2024-03-08 11:34:32] INFO: Generating sequence
  0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
  [2024-03-08 11:34:35] INFO: Filtering contained disjointigs
  0% 50% 100%
  [2024-03-08 11:34:37] INFO: Contained seqs: 0
  [2024-03-08 11:34:37] INFO: >>>STAGE: consensus
  [2024-03-08 11:34:37] INFO: Running Minimap2
  [2024-03-08 11:35:11] INFO: Computing consensus
  Process SyncManager-1:
  Traceback (most recent call last):
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
      self.run()
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/process.py", line 108, in run
      self._target(*self._args, **self._kwargs)
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/managers.py", line 608, in _run_server
      server = cls._Server(registry, address, authkey, serializer)
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/managers.py", line 154, in __init__
      self.listener = Listener(address=address, backlog=16)
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/connection.py", line 448, in __init__
      self._listener = SocketListener(address, family, backlog)
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/connection.py", line 591, in __init__
      self._socket.bind(address)
  OSError: AF_UNIX path too long
  Traceback (most recent call last):
    File "/home/epi2melabs/conda/bin/flye", line 33, in <module>
      sys.exit(load_entry_point('flye==2.9.3', 'console_scripts', 'flye')())
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/flye/main.py", line 756, in main
      _run(args)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/flye/main.py", line 493, in _run
      jobs[i].run()
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/flye/main.py", line 284, in run
      consensus_fasta = cons.get_consensus(out_alignment, self.in_contigs,
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/flye/polishing/consensus.py", line 71, in get_consensus
      mp_manager = multiprocessing.Manager()
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/context.py", line 57, in Manager
      m.start()
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/managers.py", line 583, in start
      self._address = reader.recv()
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/connection.py", line 250, in recv
      buf = self._recv_bytes()
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
      buf = self._recv(4)
    File "/home/epi2melabs/conda/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
      raise EOFError
  EOFError

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

no

Other demo data information

No response

kiranpatil222 commented 8 months ago

My path isn't long. Path: /data/Kiran/ONT_workFlows/wf_Bacteria

cjalder commented 8 months ago

Thanks for sending this over. Would it be possible to send over the nextflow.log? I'm hoping it will shed some light on the OSError: AF_UNIX path too long error coming from socket, which should be using $TMPDIR, but may be configured differently on your machine.

Additionally, is this being run on an ONT device, and if so, is it using the stock Nextflow provided?
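
For context on the error itself: the message is raised by Python's socket module when the path given to a Unix domain socket does not fit into the fixed-size sun_path field of the socket address. The following is a minimal sketch, assuming Linux (where that field is 108 bytes) and CPython; try_bind is a hypothetical helper name and the 200-character path is made up purely to trigger the check. It is not taken from Flye or from the workflow.

```python
import socket


def try_bind(path):
    """Try to bind an AF_UNIX socket to `path`.

    Returns None on success, or the error text if binding fails.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError as exc:
        return str(exc)
    finally:
        sock.close()
    return None


if __name__ == "__main__":
    # Hypothetical, deliberately over-long socket path (well past 108 bytes).
    long_path = "/tmp/" + "a" * 200
    # On Linux with CPython this should report: AF_UNIX path too long
    print(try_bind(long_path))
```

The length check happens before the path is handed to the operating system, so the directory does not need to exist for the error to appear. What matters is the full socket path, not just the directory the workflow was launched from.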

kiranpatil222 commented 8 months ago

No, this is not being run on an ONT device; it is run on another server. Also, where can I find the log? I do not see it in nextflow/assets.

cjalder commented 8 months ago

It should be in the directory you ran the workflow from. Sorry, I should have noted that it is a hidden file, .nextflow.log, so you should be able to see it with ls -al.

kiranpatil222 commented 8 months ago

.nextflow.log

cjalder commented 8 months ago

This appears to be a log from a run where the FASTQ files couldn't be found. Do you have the one where the de novo assembly failed? There may be multiple logs in the directory if Nextflow was run multiple times within it, e.g. .nextflow.log.1, .nextflow.log.2, and so on. You could try grep 'AF_UNIX path too long' .nextflow.log* to find the right one.

kiranpatil222 commented 8 months ago

.nextflow.log

cjalder commented 8 months ago

Thanks. Could you have a look at your $TMPDIR path? You should be able to get it with echo $TMPDIR. Is it particularly long?
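
As a rough way to judge "particularly long", the sketch below estimates the socket path that multiprocessing would build under a given temporary directory. It rests on several assumptions: that CPython creates a pymp-XXXXXXXX directory under tempfile.gettempdir() and a listener-XXXXXXXX socket inside it (8 random characters each, an implementation detail), and that the Linux limit of 108 bytes for sun_path applies. The helper names are hypothetical. Note also that when $TMPDIR is unset, tempfile.gettempdir() falls back to $TEMP, $TMP, then /tmp, /var/tmp, /usr/tmp, and finally the current working directory, so an empty $TMPDIR does not by itself rule out a long path.

```python
import os
import tempfile

# Assumption: Linux, where sockaddr_un.sun_path holds 108 bytes
# (CPython rejects paths whose byte length is >= this size).
SUN_PATH_MAX = 108
# Assumption: CPython's tempfile appends 8 random characters to its prefixes.
RANDOM_SUFFIX_LEN = 8


def estimated_socket_path_length(tmp_root):
    """Estimate the byte length of <tmp_root>/pymp-XXXXXXXX/listener-XXXXXXXX."""
    pymp_dir = "pymp-" + "x" * RANDOM_SUFFIX_LEN
    listener = "listener-" + "x" * RANDOM_SUFFIX_LEN
    return len(os.fsencode(os.path.join(tmp_root, pymp_dir, listener)))


def fits_in_sun_path(tmp_root):
    """Return True if the estimated socket path stays under the assumed limit."""
    return estimated_socket_path_length(tmp_root) < SUN_PATH_MAX


if __name__ == "__main__":
    # The directory Python would actually use in the current environment.
    root = tempfile.gettempdir()
    print(root, estimated_socket_path_length(root), fits_in_sun_path(root))
```

Under these assumptions the fixed overhead is 32 bytes, so a temporary directory of more than about 75 bytes would push the socket path over the limit. Running the script inside the same container or environment as the failing process gives the most relevant answer, since the temporary directory may differ from the one seen in an interactive shell.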

kiranpatil222 commented 8 months ago

It doesn't show anything, just a blank line, for $TMPDIR.