Closed peanut-nu closed 2 weeks ago
It looks like it is running out of memory and getting killed:
bash: line 1: 8850 Killed minigraph /work/pan_cactus ...
You can try resuming by rerunning the same command with --restart. If that doesn't work, you can try again from scratch but with more --mapCores. That would schedule fewer mapping jobs in parallel, with less total memory usage.
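To make the reasoning behind raising --mapCores concrete, here is a rough sketch of the arithmetic. It assumes a scheduler that packs jobs by core count alone, a hypothetical 64-core machine, and the 50 Gi per-job memory request that appears in the log of the original report. The function names are made up for illustration, and actual minigraph memory use can differ from the requested amount.

```python
def max_parallel_jobs(total_cores: int, cores_per_job: int) -> int:
    """Upper bound on how many jobs fit at once when packing by cores only."""
    if cores_per_job <= 0:
        raise ValueError("cores_per_job must be positive")
    return total_cores // cores_per_job


def peak_memory_gib(total_cores: int, cores_per_job: int, mem_per_job_gib: float) -> float:
    """Worst-case combined memory if every concurrent job uses mem_per_job_gib."""
    return max_parallel_jobs(total_cores, cores_per_job) * mem_per_job_gib


TOTAL_CORES = 64        # hypothetical machine size, not taken from the thread
MEM_PER_JOB_GIB = 50.0  # per-job request shown in the log ("memory: 50.0 Gi")

for map_cores in (8, 16, 32):
    jobs = max_parallel_jobs(TOTAL_CORES, map_cores)
    peak = peak_memory_gib(TOTAL_CORES, map_cores, MEM_PER_JOB_GIB)
    print(f"--mapCores {map_cores}: up to {jobs} jobs at once, about {peak:.0f} GiB in total")
```

Under these assumptions, doubling --mapCores halves the number of mapping jobs that can run together, and therefore roughly halves the combined memory they need.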
Thank you so much, that's what I figured. But I changed --mapCores to 60 and reran the same command with --restart, and it didn't work. If I run it again, can --mapCores be increased to any value? The command covers 12 species, each with 12 chromosomes.
To change --mapCores you have to rerun from scratch (i.e. without --restart).
Got it, thank you so much.
Hello, I am testing the Minigraph-Cactus pipeline. The program reported an error, but it is still running.
The command is:
nohup cactus-pangenome ./jobstore genome.txt --outDir pangenome --workDir `pwd`/tmp --outName pangenome --chop --haplo --permissiveContigFilter --gbz full clip filter --gfa full clip filter --vcf full clip filter --giraffe full clip filter --chrom-vg full clip filter --viz full clip --odgi full clip filter --chrom-og full clip filter --binariesMode local --filter 2 --mgCores 60 --mapCores 8 --consCores 32 --indexCores 32 --refContigs HiC_scaffold_1 HiC_scaffold_2 HiC_scaffold_3 HiC_scaffold_4 HiC_scaffold_5 HiC_scaffold_6 HiC_scaffold_7 HiC_scaffold_8 HiC_scaffold_9 HiC_scaffold_10 HiC_scaffold_11 HiC_scaffold_12 --reference che --logFile cactus-pangenome.log
The error is:
Due to failure we are reducing the remaining try count of job 'minigraph_map_one' kind-minigraph_map_one/instance-yo5g871q v8 with ID kind-minigraph_map_one/instance-yo5g871q to 1
Issued job 'minigraph_map_one' kind-minigraph_map_one/instance-yo5g871q v9 with job batch system ID: 19 and disk: 7.0 Gi, memory: 50.0 Gi, cores: 8, accelerators: [], preemptible: False
2024-05-31 16:08:25.898995: Running the command: "bash -c set -eo pipefail && minigraph /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/76b4/94bb/tmpurhmx_ba/mg.gfa /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/76b4/94bb/tmpurhmx_ba/oxy.fa -o /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/76b4/94bb/tmpurhmx_ba/oxy.gaf -c -xasm -t 8"
Got exit code 1 (indicating failure) from job _toil_worker minigraph_map_one file:/work/pan_cactus/jobstore kind-minigraph_map_one/instance-v1qj8dsk.
Job failed with exit value 1: 'minigraph_map_one' kind-minigraph_map_one/instance-v1qj8dsk v9
Exit reason: None
The job seems to have left a log file, indicating failure: 'minigraph_map_one' kind-minigraph_map_one/instance-v1qj8dsk v11
Log from job "kind-minigraph_map_one/instance-v1qj8dsk" follows:
=========>
[2024-05-31T16:04:58+0800] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2024-05-31T16:04:58+0800] [MainThread] [I] [toil] Running Toil version 5.12.0-6d5a5b83b649cd8adf34a5cfe89e7690c95189d3 on host e06a0e75afaf.
[2024-05-31T16:04:58+0800] [MainThread] [I] [toil.worker] Working on job 'minigraph_map_one' kind-minigraph_map_one/instance-v1qj8dsk v10
[2024-05-31T16:04:58+0800] [MainThread] [I] [toil.worker] Loaded body Job('minigraph_map_one' kind-minigraph_map_one/instance-v1qj8dsk v10) from description 'minigraph_map_one' kind-minigraph_map_one/instance-v1qj8dsk v10
[2024-05-31T16:04:58+0800] [MainThread] [I] [cactus.shared.common] Running the command ['bash', '-c', 'set -eo pipefail && minigraph /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/mg.gfa /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/ilex.fa -o /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/ilex.gaf -c -xasm -t 8']
[2024-05-31T16:04:58+0800] [MainThread] [I] [toil-rt] 2024-05-31 16:04:58.746277: Running the command: "bash -c set -eo pipefail && minigraph /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/mg.gfa /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/ilex.fa -o /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/ilex.gaf -c -xasm -t 8"
[2024-05-31T16:09:45+0800] [MainThread] [W] [toil.fileStores.abstractFileStore] Failed job accessed files:
[2024-05-31T16:09:45+0800] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-unzip_gz/instance-wb6e6feh/file-ac206323ea744b1a8a6c07f7e57f9ead/pangenome.sv.gfa' to path '/work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/mg.gfa'
[2024-05-31T16:09:45+0800] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/for-job/kind-sanitize_fasta_header/instance-kpfy1sly/file-51b49c5254f34d258c6767246125afa2/ilex.sanitized.fa' to path '/work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/ilex.fa'
Traceback (most recent call last):
  File "/share/work/biosoft/python/Python-v3.9.16/lib/python3.9/site-packages/toil/worker.py", line 403, in workerScript
    job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
  File "/share/work/biosoft/python/Python-v3.9.16/lib/python3.9/site-packages/toil/job.py", line 2774, in _runner
    returnValues = self._run(jobGraph=None, fileStore=fileStore)
  File "/share/work/biosoft/python/Python-v3.9.16/lib/python3.9/site-packages/toil/job.py", line 2691, in _run
    return self.run(fileStore)
  File "/share/work/biosoft/python/Python-v3.9.16/lib/python3.9/site-packages/toil/job.py", line 2919, in run
    rValue = userFunction(*((self,) + tuple(self._args)), *self._kwargs)
  File "/share/work/biosoft/python/Python-v3.9.16/lib/python3.9/site-packages/cactus/refmap/cactus_graphmap.py", line 366, in minigraph_map_one
    cactus_call(parameters=cmd, job_memory=job.memory)
  File "/share/work/biosoft/python/Python-v3.9.16/lib/python3.9/site-packages/cactus/shared/common.py", line 889, in cactus_call
    raise RuntimeError("{}Command {} exited {}: {}".format(sigill_msg, call, process.returncode, out))
RuntimeError: Command ['bash', '-c', 'set -eo pipefail && minigraph /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/mg.gfa /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/ilex.fa -o /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/ilex.gaf -c -xasm -t 8'] exited 137: stderr=[M::main::21.408*0.97] loaded the graph from "/work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/mg.gfa"
[M::mg_index::118.866*1.43] indexed the graph
[M::mg_opt_update::128.102*1.40] occ_max1=100; lc_max_occ=3
bash: line 1: 8850 Killed minigraph /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/mg.gfa /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/ilex.fa -o /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/ff78/6f03/tmpcpvukgxf/ilex.gaf -c -xasm -t 8
[2024-05-31T16:09:45+0800] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host e06a0e75afaf
<=========
Due to failure we are reducing the remaining try count of job 'minigraph_map_one' kind-minigraph_map_one/instance-v1qj8dsk v11 with ID kind-minigraph_map_one/instance-v1qj8dsk to 0
Job 'minigraph_map_one' kind-minigraph_map_one/instance-v1qj8dsk v12 is completely failed
2024-05-31 16:09:47.181848: Running the command: "bash -c set -eo pipefail && minigraph /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/7fd6/0e55/tmpyf80w66o/mg.gfa /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/7fd6/0e55/tmpyf80w66o/Lip.fa -o /work/pan_cactus/tmp/8125de4c93825d1f950e4ddaf5dd5004/7fd6/0e55/tmpyf80w66o/Lip.gaf -c -xasm -t 8"
Can you help me? Thank you so much!
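For anyone reading the log, the "exited 137" in the RuntimeError is the detail that points to the process being killed. By shell convention an exit status above 128 means the process died from signal (status - 128), and 137 - 128 = 9 is SIGKILL, which is what the kernel's out-of-memory killer sends. Below is a small sketch of that decoding, assuming a POSIX system; the helper name is made up for illustration and is not part of Cactus or Toil.

```python
import signal


def describe_exit_status(code: int) -> str:
    """Translate a shell-style exit status into a readable description."""
    if code > 128:
        signum = code - 128  # shells report death by signal N as 128 + N
        try:
            name = signal.Signals(signum).name
        except ValueError:
            return f"exit status {code} (unknown signal {signum})"
        return f"killed by {name} (signal {signum})"
    return f"exited with status {code}"


print(describe_exit_status(137))
```

A SIGKILL alone does not prove an out-of-memory kill, but combined with the "Killed" line from bash it is the usual explanation for a memory-hungry job.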