Closed: polchan closed this issue 1 year ago
Don't use --binariesMode singularity.
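In other words, drop that flag and let Cactus use the binaries already bundled in the image. A minimal sketch, reusing the image path and inputs from the command shown later in this thread:

```bash
# Same invocation as before, with --binariesMode singularity removed:
# the Cactus binaries shipped inside the image are used directly.
singularity exec ~/singularity_image/cactus_v2_6_12.sif bash -c \
  "cactus ./js ./Malus.txt ./Malus.hal --gpu 8"
```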
I tried your suggestion and ran singularity exec ~/singularity_image/cactus_v2_6_12.sif bash -c "cactus ./js ./Malus.txt ./Malus.hal --gpu 4". The FileNotFoundError: [Errno 2] No such file or directory: 'singularity' error no longer appeared, but it now reports the error below:
```
Traceback (most recent call last):
  File "/home/cactus/cactus_env/bin/cactus", line 8, in <module>
    sys.exit(main())
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/cactus/progressive/cactus_progressive.py", line 436, in main
    hal_id = toil.start(Job.wrapJobFn(progressive_workflow, options, config_node, mc_tree, og_map, input_seq_id_map))
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/common.py", line 1064, in start
    return self._runMainLoop(rootJobDescription)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/common.py", line 1539, in _runMainLoop
    return Leader(config=self.config,
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 251, in run
    self.innerLoop()
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 741, in innerLoop
    self._processReadyJobs()
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 636, in _processReadyJobs
    self._processReadyJob(message.job_id, message.result_status)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 552, in _processReadyJob
    self._runJobSuccessors(job_id)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 442, in _runJobSuccessors
    self.issueJobs(successors)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 919, in issueJobs
    self.issueJob(job)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 896, in issueJob
    jobBatchSystemID = self.batchSystem.issueBatchJob(jobNode, job_environment=job_environment)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/singleMachine.py", line 755, in issueBatchJob
    self.check_resource_request(scaled_desc)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/singleMachine.py", line 506, in check_resource_request
    raise e
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/singleMachine.py", line 502, in check_resource_request
    super().check_resource_request(requirer)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/abstractBatchSystem.py", line 344, in check_resource_request
    raise e
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/abstractBatchSystem.py", line 339, in check_resource_request
    self._check_accelerator_request(requirer)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/singleMachine.py", line 512, in _check_accelerator_request
    raise InsufficientSystemResources(requirer, 'accelerators', self.accelerator_identities, details=[
toil.batchSystems.abstractBatchSystem.InsufficientSystemResources: The job 'LastzRepeatMaskJob' kind-LastzRepeatMaskJob/instance-eos2e9uz v1 is requesting [{'count': 4, 'kind': 'gpu', 'api': 'cuda', 'brand': 'nvidia'}] accelerators, more than the maximum of [] accelerators that SingleMachineBatchSystem was configured with. The accelerator {'count': 4, 'kind': 'gpu', 'api': 'cuda', 'brand': 'nvidia'} could not be provided. Scale is set to 1.0.
```
Next, I read the help documentation and thought I might need to specify the accelerator model, so I ran singularity exec ~/singularity_image/cactus_v2_6_12.sif bash -c "cactus ./js ./Malus.txt ./Malus.hal --gpu 4 --defaultAccelerators nvidia-A800", but it still fails with the error below:
```
Traceback (most recent call last):
  File "/home/cactus/cactus_env/bin/cactus", line 8, in <module>
    sys.exit(main())
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/cactus/progressive/cactus_progressive.py", line 436, in main
    hal_id = toil.start(Job.wrapJobFn(progressive_workflow, options, config_node, mc_tree, og_map, input_seq_id_map))
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/common.py", line 1064, in start
    return self._runMainLoop(rootJobDescription)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/common.py", line 1539, in _runMainLoop
    return Leader(config=self.config,
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 251, in run
    self.innerLoop()
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 741, in innerLoop
    self._processReadyJobs()
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 636, in _processReadyJobs
    self._processReadyJob(message.job_id, message.result_status)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 532, in _processReadyJob
    self.issueJob(readyJob)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/leader.py", line 896, in issueJob
    jobBatchSystemID = self.batchSystem.issueBatchJob(jobNode, job_environment=job_environment)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/singleMachine.py", line 755, in issueBatchJob
    self.check_resource_request(scaled_desc)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/singleMachine.py", line 506, in check_resource_request
    raise e
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/singleMachine.py", line 502, in check_resource_request
    super().check_resource_request(requirer)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/abstractBatchSystem.py", line 344, in check_resource_request
    raise e
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/abstractBatchSystem.py", line 339, in check_resource_request
    self._check_accelerator_request(requirer)
  File "/home/cactus/cactus_env/lib/python3.8/site-packages/toil/batchSystems/singleMachine.py", line 512, in _check_accelerator_request
    raise InsufficientSystemResources(requirer, 'accelerators', self.accelerator_identities, details=[
toil.batchSystems.abstractBatchSystem.InsufficientSystemResources: The job 'progressive_workflow' kind-progressive_workflow/instance-towjjxw8 v1 is requesting [{'count': 1, 'kind': 'gpu', 'model': 'nvidia-A800', 'brand': 'nvidia'}] accelerators, more than the maximum of [] accelerators that SingleMachineBatchSystem was configured with. The accelerator {'count': 1, 'kind': 'gpu', 'model': 'nvidia-A800', 'brand': 'nvidia'} could not be provided. Scale is set to 1.0.
```
Sorry, I think I have misunderstood the defaultAccelerators parameter. Could you tell me how to set it correctly?
Best wishes,
Bo-Cheng
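In both tracebacks the key phrase is "more than the maximum of [] accelerators": Toil's single-machine batch system detected no GPUs inside the container, so any GPU request fails no matter how --gpu or --defaultAccelerators is set. A quick sanity check, assuming nvidia-smi is installed on the host:

```bash
# List the GPUs visible from inside the container; if nothing is shown
# (or the command is not found), Cactus/Toil cannot see the GPUs either.
singularity exec ~/singularity_image/cactus_v2_6_12.sif nvidia-smi
```

If nothing is listed here, the fix is on the Singularity side rather than in the Cactus options.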
If you want to use your GPUs inside Singularity, you need to use singularity exec's --nv option. Otherwise they will be invisible to Cactus and you will see this error.
singularity exec --nv ~/singularity_image/cactus_v2_6_12.sif bash -c "cactus ./js ./Malus.txt ./Malus.hal --gpu"
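To confirm the GPUs are now visible inside the container (assuming nvidia-smi is installed on the host; --nv binds the host's NVIDIA driver stack, including nvidia-smi, into the container):

```bash
# The GPUs listed here are the ones Cactus/Toil will be able to use.
singularity exec --nv ~/singularity_image/cactus_v2_6_12.sif nvidia-smi
```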
Hello, Glenn
I have been running the latest image release, cactus:v2.6.12-gpu, with Singularity, using this command: singularity exec ~/singularity_image/cactus_v2_6_12.sif bash -c "cactus ./js ./Malus.txt ./Malus.hal --binariesMode singularity --gpu 8". I think issue #1213 has been fixed, but a new error interrupts the run; the error message is below. Can you help me solve this issue? Thanks.
Best wishes,
Bo-Cheng