Closed callumparr closed 9 months ago
I ran:
xpore diffmod --config /staging/biology/andreachi77/xPore/configuration.ymlfile/DMSO2_STM2uM_config.yml --n_processes 14 --save_models
and I have similar errors:
Traceback (most recent call last):
  File "/opt/ohpc/Taiwania3/pkg/biology/xpore/xpore_v2.1/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/ohpc/Taiwania3/pkg/biology/xpore/xpore_v2.1/lib/python3.12/site-packages/xpore/scripts/helper.py", line 113, in run
    result = self.task_function(*next_task_args,self.locks)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/ohpc/Taiwania3/pkg/biology/xpore/xpore_v2.1/lib/python3.12/site-packages/xpore/scripts/diffmod.py", line 63, in execute
    io.save_models_to_hdf5(models,out_paths['model_filepath'])
  File "/opt/ohpc/Taiwania3/pkg/biology/xpore/xpore_v2.1/lib/python3.12/site-packages/xpore/diffmod/io.py", line 110, in save_models_to_hdf5
    model_file = h5py.File(model_filepath, 'w')
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/ohpc/Taiwania3/pkg/biology/xpore/xpore_v2.1/lib/python3.12/site-packages/h5py/_hl/files.py", line 562, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/ohpc/Taiwania3/pkg/biology/xpore/xpore_v2.1/lib/python3.12/site-packages/h5py/_hl/files.py", line 241, in make_fid
    fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 122, in h5py.h5f.create
BlockingIOError: [Errno 11] Unable to synchronously create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
Can anyone give me some tips?
Thanks!
Andrea
I ran the following command, submitted to the cluster using qsub on abigmem.q:
#!/bin/bash
# Set source of conda install
source miniconda3/etc/profile.d/conda.sh
conda activate xpore
export baseDir=/analysisdata/rawseq/bcl/callum
xpore-diffmod --config xpore_diffmod_output/Male_SPF_young_vs_old_config.yml --save_models --resume
I initially started the run interactively to check it was all working and then resumed it by qsub.
The log shows some 32,000 transcript IDs to process. An error occurs after the first few, but the run then continues. Once it gets up to 32,000 processed transcripts it just stops doing anything, and no additional error messages are written to the qsub log.
I assume it finished: if I submit the job again with --resume, this time the job finishes and no additional errors are written out.
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/xpore/scripts/diffmod.py", line 83, in execute
    io.save_models_to_hdf5(models,out_paths['model_filepath'])
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/xpore/diffmod/io.py", line 110, in save_models_to_hdf5
    model_file = h5py.File(model_filepath, 'w')
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/h5py/_hl/files.py", line 424, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/h5py/_hl/files.py", line 196, in make_fid
    fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 116, in h5py.h5f.create
OSError: Unable to create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
Process Consumer-11:
Traceback (most recent call last):
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/xpore/scripts/helper.py", line 110, in run
    result = self.task_function(*next_task_args,self.locks)
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/xpore/scripts/diffmod.py", line 83, in execute
    io.save_models_to_hdf5(models,out_paths['model_filepath'])
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/xpore/diffmod/io.py", line 110, in save_models_to_hdf5
    model_file = h5py.File(model_filepath, 'w')
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/h5py/_hl/files.py", line 424, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/h5py/_hl/files.py", line 196, in make_fid
    fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 116, in h5py.h5f.create
OSError: Unable to create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
Process Consumer-6:
Traceback (most recent call last):
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/xpore/scripts/helper.py", line 110, in run
    result = self.task_function(*next_task_args,self.locks)
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/xpore/scripts/diffmod.py", line 83, in execute
    io.save_models_to_hdf5(models,out_paths['model_filepath'])
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/xpore/diffmod/io.py", line 110, in save_models_to_hdf5
    model_file = h5py.File(model_filepath, 'w')
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/h5py/_hl/files.py", line 424, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "/home/callum/miniconda3/envs/xpore/lib/python3.8/site-packages/h5py/_hl/files.py", line 196, in make_fid
    fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 116, in h5py.h5f.create
OSError: Unable to create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
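For anyone trying to read the tracebacks above: errno 11 is EAGAIN ("Resource temporarily unavailable"), which is what a non-blocking file lock reports when another holder already has the lock. The sketch below reproduces that condition with the standard library only. It assumes the HDF5 lock behaves like a non-blocking exclusive flock, which is my assumption and not something confirmed from the xpore or h5py source. The file name models.hdf5 and both function names are hypothetical, and no HDF5 file is actually written.

```python
import errno
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Open `path` and attempt a non-blocking exclusive flock.

    Returns (fd, None) on success, or (None, errno) if the lock is held elsewhere.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError as exc:
        os.close(fd)
        return None, exc.errno
    return fd, None


def demo_lock_contention():
    """Show that a second opener of an already locked file gets EAGAIN."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file name standing in for the shared model file.
        path = os.path.join(tmp, "models.hdf5")

        # First opener takes the lock.
        first_fd, first_err = try_exclusive_lock(path)

        # Second opener (a separate open of the same file) is refused.
        _, second_err = try_exclusive_lock(path)

        # Closing the first descriptor releases its lock ...
        os.close(first_fd)

        # ... so a later attempt succeeds again.
        third_fd, third_err = try_exclusive_lock(path)
        os.close(third_fd)

        return first_err, second_err, third_err


if __name__ == "__main__":
    first_err, second_err, third_err = demo_lock_contention()
    print("first attempt errno:", first_err)
    print("second attempt errno:", second_err, "(EAGAIN is", errno.EAGAIN, ")")
    print("third attempt errno:", third_err)
```

If the same mechanism is at work here, it would be consistent with several Consumer processes trying to open the same model file for writing at the same moment, and with the rerun using --resume going through once nothing else holds the file. That is an interpretation of the logs, not a confirmed diagnosis.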