ComparativeGenomicsToolkit / cactus

Official home of the genome aligner based on the notion of Cactus graphs

cactus-graphmap-join failed at "make_giraffe_indexes" #780

Closed · minglibio closed this issue 2 years ago

minglibio commented 2 years ago

Hi,

I am trying to build a pangenome graph with cactus. When I use contig-level assemblies to run the pipeline, everything goes well. But when I scaffold these contig-level assemblies to chromosome level and run the cactus pangenome pipeline on the chromosome-level assemblies, it always fails at the cactus-graphmap-join step with the error below:

[2022-09-04T12:52:26+0200] [MainThread] [I] [toil-rt] 2022-09-04 12:52:26.284732: Running the command: "bash -c set -eo pipefail && vg deconstruct /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ff22/fc7e/tmpx28r3a8f/Midas.xg -P fAmpCit -a -r /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ff22/fc7e/tmpx28r3a8f/Midas.snarls -g /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ff22/fc7e/tmpx28r3a8f/Midas.gbwt -T /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ff22/fc7e/tmpx28r3a8f/Midas.trans -t 1 | bgzip --threads 1"
[2022-09-04T12:53:11+0200] [Thread-1  ] [E] [toil.batchSystems.singleMachine] Got exit code 1 (indicating failure) from job _toil_worker make_giraffe_indexes file:/data/scc3/ming.li/project/03.MidasPangenome/04.GraphGenome/cactus/join/jobstore kind-make_giraffe_indexes/instance-u_mhjltu.
[2022-09-04T12:53:11+0200] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v1
Exit reason: None
[2022-09-04T12:53:11+0200] [MainThread] [W] [toil.leader] The job seems to have left a log file, indicating failure: 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v2
[2022-09-04T12:53:11+0200] [MainThread] [W] [toil.leader] Log from job "kind-make_giraffe_indexes/instance-u_mhjltu" follows:
=========>
    [2022-09-04T12:52:20+0200] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
    [2022-09-04T12:52:20+0200] [MainThread] [I] [toil] Running Toil version 5.6.0-c34146a6437e4407a61e946e968bcce67a0ebbca on host scc.
    [2022-09-04T12:52:20+0200] [MainThread] [I] [toil.worker] Working on job 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v1
    [2022-09-04T12:52:21+0200] [MainThread] [I] [toil.worker] Loaded body Job('make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v1) from description 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v1
    [2022-09-04T12:52:21+0200] [MainThread] [I] [cactus.shared.common] Running the command ['vg', 'index', '-t', '1', '-j', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.dist', '-s', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.snarls', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.xg']
    [2022-09-04T12:52:21+0200] [MainThread] [I] [toil-rt] 2022-09-04 12:52:21.119418: Running the command: "vg index -t 1 -j /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.dist -s /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.snarls /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.xg"
    [2022-09-04T12:53:11+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Failed job accessed files:
    [2022-09-04T12:53:11+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-vg_indexes/instance-3n02_0hl/file-61cfff0145c847b68c8c44c1ccc995ca/merged.xg' to path '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.xg'
    [2022-09-04T12:53:11+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-vg_indexes/instance-3n02_0hl/file-3b6877ce2c6e43e2bfd29ff6e438d4e3/merged.gbwt' to path '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.gbwt'
    [2022-09-04T12:53:11+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-vg_indexes/instance-3n02_0hl/file-ca7cd4e1175c4af9af5259c31fa509e5/merged.snarls' to path '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.snarls'
    Traceback (most recent call last):
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/worker.py", line 405, in workerScript
        job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/job.py", line 2399, in _runner
        returnValues = self._run(jobGraph=None, fileStore=fileStore)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/job.py", line 2317, in _run
        return self.run(fileStore)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/job.py", line 2540, in run
        rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/cactus/refmap/cactus_graphmap_join.py", line 604, in make_giraffe_indexes
        cactus_call(parameters=['vg', 'index', '-t', str(job.cores), '-j', dist_path, '-s', snarls_path, xg_path])
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/cactus/shared/common.py", line 789, in cactus_call
        raise RuntimeError("{}Command {} exited {}: {}".format(sigill_msg, call, process.returncode, out))
    RuntimeError: Command ['vg', 'index', '-t', '1', '-j', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.dist', '-s', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.snarls', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/df5d/7e34/tmpwidddtme/Midas.xg'] exited 134: stdout=None, stderr=terminate called after throwing an instance of 'std::bad_alloc'
      what():  std::bad_alloc
    ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
    Stack trace path: /tmp/vg_crash_ZmUn5q/stacktrace.txt
    Please include the stack trace file in your bug report!

    [2022-09-04T12:53:11+0200] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host scc
<=========
[2022-09-04T12:53:12+0200] [MainThread] [W] [toil.job] Due to failure we are reducing the remaining try count of job 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v2 with ID kind-make_giraffe_indexes/instance-u_mhjltu to 1
[2022-09-04T12:53:12+0200] [MainThread] [I] [toil.leader] Issued job 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v3 with job batch system ID: 58 and cores: 1, disk: 26.3 Gi, and memory: 37.3 Gi
[2022-09-04T12:53:12+0200] [MainThread] [I] [toil.worker] Redirecting logging to /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/worker_log.txt
[2022-09-04T12:53:12+0200] [MainThread] [I] [toil-rt] 2022-09-04 12:53:12.724488: Running the command: "vg index -t 1 -j /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.dist -s /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.snarls /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.xg"
[2022-09-04T12:54:06+0200] [Thread-1  ] [E] [toil.batchSystems.singleMachine] Got exit code 1 (indicating failure) from job _toil_worker make_giraffe_indexes file:/data/scc3/ming.li/project/03.MidasPangenome/04.GraphGenome/cactus/join/jobstore kind-make_giraffe_indexes/instance-u_mhjltu.
[2022-09-04T12:54:06+0200] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v3
Exit reason: None
[2022-09-04T12:54:06+0200] [MainThread] [W] [toil.leader] The job seems to have left a log file, indicating failure: 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v5
[2022-09-04T12:54:06+0200] [MainThread] [W] [toil.leader] Log from job "kind-make_giraffe_indexes/instance-u_mhjltu" follows:
=========>
    [2022-09-04T12:53:12+0200] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
    [2022-09-04T12:53:12+0200] [MainThread] [I] [toil] Running Toil version 5.6.0-c34146a6437e4407a61e946e968bcce67a0ebbca on host scc.
    [2022-09-04T12:53:12+0200] [MainThread] [I] [toil.worker] Working on job 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v4
    [2022-09-04T12:53:12+0200] [MainThread] [I] [toil.worker] Loaded body Job('make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v4) from description 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v4
    [2022-09-04T12:53:12+0200] [MainThread] [I] [cactus.shared.common] Running the command ['vg', 'index', '-t', '1', '-j', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.dist', '-s', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.snarls', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.xg']
    [2022-09-04T12:53:12+0200] [MainThread] [I] [toil-rt] 2022-09-04 12:53:12.724488: Running the command: "vg index -t 1 -j /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.dist -s /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.snarls /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.xg"
    [2022-09-04T12:54:06+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Failed job accessed files:
    [2022-09-04T12:54:06+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-vg_indexes/instance-3n02_0hl/file-61cfff0145c847b68c8c44c1ccc995ca/merged.xg' to path '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.xg'
    [2022-09-04T12:54:06+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-vg_indexes/instance-3n02_0hl/file-3b6877ce2c6e43e2bfd29ff6e438d4e3/merged.gbwt' to path '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.gbwt'
    [2022-09-04T12:54:06+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-vg_indexes/instance-3n02_0hl/file-ca7cd4e1175c4af9af5259c31fa509e5/merged.snarls' to path '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.snarls'
    Traceback (most recent call last):
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/worker.py", line 405, in workerScript
        job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/job.py", line 2399, in _runner
        returnValues = self._run(jobGraph=None, fileStore=fileStore)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/job.py", line 2317, in _run
        return self.run(fileStore)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/job.py", line 2540, in run
        rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/cactus/refmap/cactus_graphmap_join.py", line 604, in make_giraffe_indexes
        cactus_call(parameters=['vg', 'index', '-t', str(job.cores), '-j', dist_path, '-s', snarls_path, xg_path])
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/cactus/shared/common.py", line 789, in cactus_call
        raise RuntimeError("{}Command {} exited {}: {}".format(sigill_msg, call, process.returncode, out))
    RuntimeError: Command ['vg', 'index', '-t', '1', '-j', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.dist', '-s', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.snarls', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.xg'] exited 134: stdout=None, stderr=terminate called after throwing an instance of 'std::bad_alloc'
      what():  std::bad_alloc
    ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
    Stack trace path: /tmp/vg_crash_5hFkdm/stacktrace.txt
    Please include the stack trace file in your bug report!

    [2022-09-04T12:54:06+0200] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host scc
<=========
[2022-09-04T12:54:06+0200] [MainThread] [W] [toil.job] Due to failure we are reducing the remaining try count of job 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v5 with ID kind-make_giraffe_indexes/instance-u_mhjltu to 0
[2022-09-04T12:54:06+0200] [MainThread] [W] [toil.leader] Job 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v6 is completely failed
[2022-09-04T13:42:16+0200] [MainThread] [I] [toil.leader] 1 jobs are running, 0 jobs are issued and waiting to run
[2022-09-04T14:42:16+0200] [MainThread] [I] [toil.leader] 1 jobs are running, 0 jobs are issued and waiting to run
[2022-09-04T15:42:17+0200] [MainThread] [I] [toil.leader] 1 jobs are running, 0 jobs are issued and waiting to run
[2022-09-04T16:42:17+0200] [MainThread] [I] [toil.leader] 1 jobs are running, 0 jobs are issued and waiting to run
[2022-09-04T16:56:58+0200] [MainThread] [I] [toil-rt] 2022-09-04 16:56:58.747471: Successfully ran: "bash -c set -eo pipefail && vg deconstruct /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ff22/fc7e/tmpx28r3a8f/Midas.xg -P fAmpCit -a -r /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ff22/fc7e/tmpx28r3a8f/Midas.snarls -g /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ff22/fc7e/tmpx28r3a8f/Midas.gbwt -T /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ff22/fc7e/tmpx28r3a8f/Midas.trans -t 1 | bgzip --threads 1" in 14672.4593 seconds
[2022-09-04T16:56:58+0200] [MainThread] [I] [toil-rt] 2022-09-04 16:56:58.747955: Running the command: "tabix -p vcf /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ff22/fc7e/tmpx28r3a8f/merged.vcf.gz"
[2022-09-04T16:58:14+0200] [MainThread] [I] [toil-rt] 2022-09-04 16:58:14.609595: Successfully ran: "tabix -p vcf /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ff22/fc7e/tmpx28r3a8f/merged.vcf.gz" in 75.7315 seconds
[2022-09-04T16:58:27+0200] [MainThread] [I] [toil.leader] Finished toil run with 5 failed jobs.
[2022-09-04T16:58:27+0200] [MainThread] [I] [toil.leader] Failed jobs at end of the run: 'graphmap_join_workflow' kind-graphmap_join_workflow/instance-aiyb359x v2 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v6 'Job' kind-join_vg/instance-ctqcroer v2 'Job' kind-Job/instance-t4d6h6rr v2 'vg_indexes' kind-vg_indexes/instance-3n02_0hl v3
[2022-09-04T16:58:27+0200] [MainThread] [I] [toil.realtimeLogger] Stopping real-time logging server.
[2022-09-04T16:58:27+0200] [MainThread] [I] [toil.realtimeLogger] Joining real-time logging server thread.
Traceback (most recent call last):
  File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/bin/cactus-graphmap-join", line 8, in <module>
    sys.exit(main())
  File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/cactus/refmap/cactus_graphmap_join.py", line 119, in main
    graphmap_join(options)
  File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/cactus/refmap/cactus_graphmap_join.py", line 181, in graphmap_join
    wf_output = toil.start(Job.wrapJobFn(graphmap_join_workflow, options, config, vg_ids, hal_ids, unclip_seq_id_map, bed_ids))
  File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/common.py", line 951, in start
    return self._runMainLoop(rootJobDescription)
  File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/common.py", line 1273, in _runMainLoop
    return Leader(config=self.config,
  File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/leader.py", line 289, in run
    raise FailedJobsException(self.jobStore, failed_jobs, exit_code=self.recommended_fail_exit_code)
toil.leader.FailedJobsException: The job store '/data/scc3/ming.li/project/03.MidasPangenome/04.GraphGenome/cactus/join/jobstore' contains 5 failed jobs: 'graphmap_join_workflow' kind-graphmap_join_workflow/instance-aiyb359x v2, 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v6, 'Job' kind-join_vg/instance-ctqcroer v2, 'Job' kind-Job/instance-t4d6h6rr v2, 'vg_indexes' kind-vg_indexes/instance-3n02_0hl v3
Log from job "'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v6" follows:
=========>
    [2022-09-04T12:53:12+0200] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
    [2022-09-04T12:53:12+0200] [MainThread] [I] [toil] Running Toil version 5.6.0-c34146a6437e4407a61e946e968bcce67a0ebbca on host scc.
    [2022-09-04T12:53:12+0200] [MainThread] [I] [toil.worker] Working on job 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v4
    [2022-09-04T12:53:12+0200] [MainThread] [I] [toil.worker] Loaded body Job('make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v4) from description 'make_giraffe_indexes' kind-make_giraffe_indexes/instance-u_mhjltu v4
    [2022-09-04T12:53:12+0200] [MainThread] [I] [cactus.shared.common] Running the command ['vg', 'index', '-t', '1', '-j', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.dist', '-s', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.snarls', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.xg']
    [2022-09-04T12:53:12+0200] [MainThread] [I] [toil-rt] 2022-09-04 12:53:12.724488: Running the command: "vg index -t 1 -j /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.dist -s /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.snarls /localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.xg"
    [2022-09-04T12:54:06+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Failed job accessed files:
    [2022-09-04T12:54:06+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-vg_indexes/instance-3n02_0hl/file-61cfff0145c847b68c8c44c1ccc995ca/merged.xg' to path '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.xg'
    [2022-09-04T12:54:06+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-vg_indexes/instance-3n02_0hl/file-3b6877ce2c6e43e2bfd29ff6e438d4e3/merged.gbwt' to path '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.gbwt'
    [2022-09-04T12:54:06+0200] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-vg_indexes/instance-3n02_0hl/file-ca7cd4e1175c4af9af5259c31fa509e5/merged.snarls' to path '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.snarls'
    Traceback (most recent call last):
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/worker.py", line 405, in workerScript
        job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/job.py", line 2399, in _runner
        returnValues = self._run(jobGraph=None, fileStore=fileStore)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/job.py", line 2317, in _run
        return self.run(fileStore)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/toil/job.py", line 2540, in run
        rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/cactus/refmap/cactus_graphmap_join.py", line 604, in make_giraffe_indexes
        cactus_call(parameters=['vg', 'index', '-t', str(job.cores), '-j', dist_path, '-s', snarls_path, xg_path])
      File "/data/scc3/ming.li/software/cactus-bin-v2.2.0/cactus_env/lib/python3.9/site-packages/cactus/shared/common.py", line 789, in cactus_call
        raise RuntimeError("{}Command {} exited {}: {}".format(sigill_msg, call, process.returncode, out))
    RuntimeError: Command ['vg', 'index', '-t', '1', '-j', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.dist', '-s', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.snarls', '/localscratch/tmp/2075679.1.scc/a38de8dd2e3454df9954e2364cc75066/ef29/11ac/tmpave697r7/Midas.xg'] exited 134: stdout=None, stderr=terminate called after throwing an instance of 'std::bad_alloc'
      what():  std::bad_alloc
    ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
    Stack trace path: /tmp/vg_crash_5hFkdm/stacktrace.txt
    Please include the stack trace file in your bug report!

    [2022-09-04T12:54:06+0200] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host scc

I tried versions 2.1.1 and 2.2.0; both fail the same way.

I wondered whether it might be a memory issue, but when I checked the memory usage, it was far below the system limit. With the contig-level assemblies, the cactus-graphmap-join step used 55 GB of memory. With the chromosome-level assemblies, I tried different memory limits (80, 160, and 200 GB), but all runs failed.

The command is:

cactus-graphmap-join ./jobstore --outDir /path/to/outdir --outName Graph --reference RefAss --wlineSep "." --vcf --giraffe --gfaffix --vg /path/to/vgs/*.vg  --hal /path/to/hals/*.hal  --maxCores 16 --maxMemory 160G  --realTimeLogging --binariesMode local

Any suggestion?

Best, Ming

glennhickey commented 2 years ago

This is a crash while computing the "distance index" (vg index -j) for giraffe. This step uses by far the most memory of anything in the pipeline, so running out of memory would be my first guess as to the cause. But it could be a bug as well.

In either case, it's a sign that your graph may be too complex to map to efficiently with giraffe. The first thing I'd try is clipping out the unaligned regions by adding --clipLength 10000 --clipNonMinigraph to cactus-graphmap-join. These options are explained here: https://github.com/ComparativeGenomicsToolkit/cactus/blob/master/doc/pangenome.md#hprc-graph-creating-the-whole-genome-graph

This is necessary, for example, to index the HPRC graph in under 256 GB of memory.

minglibio commented 2 years ago

Thanks for your explanation! The issue was solved by adding --clipLength 10000 --clipNonMinigraph.

minglibio commented 2 years ago

Hi @glennhickey

Following your suggestion, I got the final graph genome. But when I visualized a known SV, I got an unexpected result.

What I expected: [image] In the image above, the bottom path is the reference. This SV is an insertion.

What I got now: [image]

The only differences I can identify are that the first image was obtained with cactus v2.1.1 from contig-level assemblies, while the second was obtained with cactus v2.2.0 from chromosome-level assemblies, with your suggested parameters added at the cactus-graphmap-join step.

Any suggestion?

Best, Ming

glennhickey commented 2 years ago

The pangenome pipeline is virtually identical between v2.2.0 and v2.1.1, so I'd be surprised if the difference you see is related to the change in version -- it is more likely due to the difference in input assemblies.

If I read this right, your SV insertion has moved to the right when using the chromosome-based assemblies, and a new small insertion appears in its old location. Meanwhile, a deletion in the fifth assembly from the top disappears.

I agree that the newer scenario seems less parsimonious, but I don't have a good explanation. Are there any changes in the flanking regions?

minglibio commented 2 years ago

@glennhickey

I finally figured it out. It was not caused by cactus, but by the fact that when I used odgi to draw this graph, I added odgi sort --optimize.

I also found some differences between cactus v2.2.0 and v2.1.1 that can cause different alignment results. In v2.2.0, the --pafInput option of cactus-align was discarded. Using v2.1.1 with --pafInput, I found that most reference regions are aligned by all of the assemblies; using v2.2.0 without --pafInput, most reference regions are aligned by a quarter or fewer of the assemblies.

In my case, this insertion exists in different individuals. The identity among the insertions of different individuals is greater than 99%, but they are reported as distinct alleles because of base differences that are negligible relative to their lengths. I have determined that the presence of this insertion is the key to the phenotypic differences among individuals, but having multiple insertion alleles will cause trouble for my downstream analysis, such as GWAS. Is there any way for cactus to ignore small variations within these large structural variants?

glennhickey commented 2 years ago

Before v2.2.0, Cactus used lastz cigars for pairwise alignments; the --pafInput option told it to expect a PAF and convert it into lastz cigar format. In v2.2.0 Cactus was changed to use PAF throughout (lastz cigars are no longer supported at all), so the option was deprecated. This really should not affect any output (and we've run several tests to this effect). The other change is that v2.2.0 uses a slightly newer version of minigraph.

Anyway, I'd really like to figure out this dramatic difference you're seeing between versions. It is completely unexpected. Are you able to share any of the data? Or even any statistics about the output PAF of cactus-graphmap?

Your question about the insertion alleles is a good one, and it touches on a fundamental limitation of the graph-to-VCF export process. Graphs elegantly store nested variants, like SNPs inside insertions, but this information is lost when going to VCF. Others have certainly written scripts to merge similar insertions. Perhaps @jmonlong has a link to one?
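One external option for such merging (my suggestion, not something cactus ships or that this thread confirms) is truvari, whose collapse subcommand merges SV records above a sequence-similarity threshold. The flag names below are assumptions to double-check against `truvari collapse --help` for your installed version; the sketch only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical post-processing of the exported VCF, outside of cactus:
# collapse near-identical insertion alleles into a single representative.
# All flags here are assumptions; verify them against your truvari version.
COLLAPSE_CMD=(truvari collapse
  -i merged.vcf.gz        # VCF exported by cactus-graphmap-join
  -o collapsed.vcf        # one representative record per cluster
  -c removed.vcf          # records that were merged into a representative
  --pctseq 0.99)          # require ~99% sequence similarity to merge
# Dry run: print the assembled command instead of executing it.
echo "${COLLAPSE_CMD[@]}"
```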

minglibio commented 2 years ago

Hi @glennhickey

I have tested different commands on a small dataset with different cactus versions, and this time everything seems to work. I will let you know if I run into this issue again.

Thanks for your explanation about the insertion alleles.

Best, Ming

minglibio commented 2 years ago

Hi @glennhickey

This issue has arisen again. I have sent some data to your email; I hope it helps you check the issue.

Best, Ming

glennhickey commented 2 years ago

I got the data. Thanks for taking the time to make a clear, reproducible case. I will take a look ASAP and get back to you.

glennhickey commented 2 years ago

OK, I am indeed seeing a huge difference. Up to this point, both logs are identical -- so I'm assuming the PAF was read in okay.

cactus_consolidated(Anc0): Starting annealing round with a minimum chain length of 64 and an alignment trim of 3
cactus_consolidated(Anc0): There were 619832 blocks in the sequence graph, representing 1354481101 total aligned bases
cactus_consolidated(Anc0): Block degree stats: min 1, avg 22.360178, median 26, max 36
cactus_consolidated(Anc0): Block support stats: min 0.000000, avg 0.072627, median 0.035714, max 0.500000
cactus_consolidated(Anc0): Starting annealing round with a minimum chain length of 64 and an alignment trim of 3

But then things get wildly different.

2.1.1

cactus_consolidated(Anc0): A melting round is destroying 59500 blocks with an average degree of 20.493731 from chains with length less than 2. Total aligned bases lost: 1219377
cactus_consolidated(Anc0): Starting melting round with a minimum chain length of 4
cactus_consolidated(Anc0): A melting round is destroying 63188 blocks with an average degree of 21.329319 from chains with length less than 4. Total aligned bases lost: 3132698
cactus_consolidated(Anc0): Starting melting round with a minimum chain length of 8
cactus_consolidated(Anc0): A melting round is destroying 212835 blocks with an average degree of 20.883031 from chains with length less than 8. Total aligned bases lost: 28327262
cactus_consolidated(Anc0): A melting round is destroying 40864 blocks with an average degree of 17.956832 from chains with length less than 64. Total aligned bases lost: 8854850
cactus_consolidated(Anc0): Destroying 4385 recoverable blocks
cactus_consolidated(Anc0): The blocks covered 742067 columns for a total of 21533434 aligned bases
cactus_consolidated(Anc0): Destroying 177 recoverable blocks
cactus_consolidated(Anc0): The blocks covered 18374 columns for a total of 338582 aligned bases
cactus_consolidated(Anc0): Destroying 0 recoverable blocks
cactus_consolidated(Anc0): The blocks covered 0 columns for a total of 0 aligned bases
cactus_consolidated(Anc0): Ran cactus caf, 214 seconds have elapsed
cactus_consolidated(Anc0): Ran extended flowers ready for bar, 238 seconds have elapsed

2.2.1

cactus_consolidated(Anc0): Starting melting round with a minimum chain length of 2
cactus_consolidated(Anc0): A melting round is destroying 24172 blocks with an average degree of 11.199528 from chains with length less than 2. Total aligned bases lost: 270715
cactus_consolidated(Anc0): Starting melting round with a minimum chain length of 4
cactus_consolidated(Anc0): A melting round is destroying 22394 blocks with an average degree of 11.070689 from chains with length less than 4. Total aligned bases lost: 582236
cactus_consolidated(Anc0): Starting melting round with a minimum chain length of 8
cactus_consolidated(Anc0): A melting round is destroying 65923 blocks with an average degree of 9.571364 from chains with length less than 8. Total aligned bases lost: 3900490
cactus_consolidated(Anc0): A melting round is destroying 34723 blocks with an average degree of 9.519310 from chains with length less than 64. Total aligned bases lost: 2994868
cactus_consolidated(Anc0): Destroying 819 recoverable blocks
cactus_consolidated(Anc0): The blocks covered 193256 columns for a total of 1435759 aligned bases
cactus_consolidated(Anc0): Destroying 197 recoverable blocks
cactus_consolidated(Anc0): The blocks covered 63567 columns for a total of 291447 aligned bases
cactus_consolidated(Anc0): Destroying 0 recoverable blocks
cactus_consolidated(Anc0): The blocks covered 0 columns for a total of 0 aligned bases
cactus_consolidated(Anc0): Sequence graph statistics after melting:
cactus_consolidated(Anc0): There were 56506 blocks in the sequence graph, representing 67595807 total aligned bases
cactus_consolidated(Anc0): Block degree stats: min 1, avg 4.563409, median 1, max 19
cactus_consolidated(Anc0): Block support stats: min 0.000000, avg 0.077963, median 0.000000, max 0.500000
cactus_consolidated(Anc0): Pinch graph component with 669 nodes and 730 edges is being split up by breaking 7 edges to reduce size to less than 581 max, but found 4 pointless edges
cactus_consolidated(Anc0): Pinch graph component with 670 nodes and 761 edges is being split up by breaking 47 edges to reduce size to less than 581 max, but found 19 pointless edges
cactus_consolidated(Anc0): Pinch graph component with 2702 nodes and 3051 edges is being split up by breaking 60 edges to reduce size to less than 581 max, but found 17 pointless edges
cactus_consolidated(Anc0): Pinch graph component with 1050 nodes and 1193 edges is being split up by breaking 85 edges to reduce size to less than 581 max, but found 35 pointless edges
cactus_consolidated(Anc0): Ran cactus caf, 61 seconds have elapsed
cactus_consolidated(Anc0): Ran extended flowers ready for bar, 62 seconds have elapsed

And then after that, it gets worse.

2.1.1

cactus_consolidated(Anc0): Ran cactus bar (use poa:1), 367 seconds have elapsed
cactus_consolidated(Anc0): There are 10 layers in the flowers hierarchy
cactus_consolidated(Anc0): In the 0 layer there are 1 flowers in the flowers hierarchy
cactus_consolidated(Anc0): In the 1 layer there are 119131 flowers in the flowers hierarchy
cactus_consolidated(Anc0): In the 2 layer there are 99427 flowers in the flowers hierarchy
cactus_consolidated(Anc0): In the 3 layer there are 38815 flowers in the flowers hierarchy
cactus_consolidated(Anc0): In the 4 layer there are 11507 flowers in the flowers hierarchy
cactus_consolidated(Anc0): In the 5 layer there are 2228 flowers in the flowers hierarchy
cactus_consolidated(Anc0): In the 6 layer there are 383 flowers in the flowers hierarchy
cactus_consolidated(Anc0): In the 7 layer there are 116 flowers in the flowers hierarchy
cactus_consolidated(Anc0): In the 8 layer there are 15 flowers in the flowers hierarchy
cactus_consolidated(Anc0): In the 9 layer there are 1 flowers in the flowers hierarchy
cactus_consolidated(Anc0): Ran cactus make reference, 399 seconds have elapsed
cactus_consolidated(Anc0): Ran cactus make reference bottom up coordinates, 449 seconds have elapsed
cactus_consolidated(Anc0): Ran cactus make reference top down coordinates, 450 seconds have elapsed
cactus_consolidated(Anc0): Ran cactus to hal stage, 467 seconds have elapsed
cactus_consolidated(Anc0): Dumped sequences for hal file, 473 seconds have elapsed
cactus_consolidated(Anc0): Dumped reference sequences, 473 seconds have elapsed
cactus_consolidated(Anc0): Cactus consolidated is done!, 473 seconds have elapsed

but on 2.2.1 it has been stuck on bar for 30+ minutes, using tons of memory and all cores.

The deltas are below. In terms of the config, there are 4 changes to CAF parameters; I think only the last 2 are suspicious. I will try reverting them and see what happens.

-blockTrim="2"
+blockTrim="5"

-maxRecoverableChainsIterations="5"
+maxRecoverableChainsIterations="10"

-alignmentFilter="filterSecondariesByMultipleSpecies"
+alignmentFilter="filterSecondariesByMultipleSequences"

-minimumBlockDegreeToCheckSupport="10"
+minimumBlockDegreeToCheckSupport="0"
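For anyone wanting to test that revert without rebuilding, cactus reads its CAF parameters from an XML config that can be overridden at run time (I'm assuming the --configFile option and these attribute names, as shown in the diff above, apply to your install). A minimal sketch, using a stand-in XML fragment in place of the shipped config:

```shell
#!/usr/bin/env bash
# Sketch: flip the two suspicious CAF parameters back to their v2.1.1
# values by editing a copy of the config. The fragment below stands in
# for the real config file shipped with cactus (element/attribute layout
# is assumed, matching the diff in this comment).
cat > my_config.xml <<'EOF'
<caf alignmentFilter="filterSecondariesByMultipleSequences"
     minimumBlockDegreeToCheckSupport="0"
     blockTrim="5" maxRecoverableChainsIterations="10"/>
EOF
# Revert the two v2.2.x values in place.
sed -i \
  -e 's/filterSecondariesByMultipleSequences/filterSecondariesByMultipleSpecies/' \
  -e 's/minimumBlockDegreeToCheckSupport="0"/minimumBlockDegreeToCheckSupport="10"/' \
  my_config.xml
cat my_config.xml
# then pass the edited file back, e.g.: cactus ... --configFile my_config.xml
```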

glennhickey commented 2 years ago

@minglibio it should work again in v2.2.2

minglibio commented 2 years ago

Thanks! @glennhickey