griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

OSError: [Errno 9] Bad file descriptor #1080

Closed ahkamzifo closed 1 month ago

ahkamzifo commented 4 months ago

I am facing an issue similar to #629 when running pvacseq from the Docker image via Snakemake as part of a pipeline.

pVACtools Version / Docker Image 4.1.0 - running via Singularity (v3.8.6)

Operating System NAME="Ubuntu" VERSION="22.04.3 LTS (Jammy Jellyfish)"

While running the "pvacseq run" command, my run is interrupted after a while. The error log for one particular sample is shown below, and the full log file is attached (SRR8281218.neoantigen_pvacseq.log).

Traceback (most recent call last):
  File "/opt/iedb/mhc_i/src/predict_binding.py", line 516, in <module>
    Prediction().main()
  File "/opt/iedb/mhc_i/src/predict_binding.py", line 494, in main
    stdin = sys.stdin.readline().strip()
OSError: [Errno 9] Bad file descriptor
Traceback (most recent call last):
  File "/opt/iedb/mhc_i/src/predict_binding.py", line 516, in <module>
    Prediction().main()
  File "/opt/iedb/mhc_i/src/predict_binding.py", line 494, in main
    stdin = sys.stdin.readline().strip()
OSError: [Errno 9] Bad file descriptor
Traceback (most recent call last):
  File "/opt/iedb/mhc_i/src/predict_binding.py", line 516, in <module>
    Prediction().main()
  File "/opt/iedb/mhc_i/src/predict_binding.py", line 494, in main
    stdin = sys.stdin.readline().strip()
OSError: [Errno 9] Bad file descriptor
An exception occured in thread 5: (<class 'subprocess.CalledProcessError'>, Command '['/usr/local/bin/python', '/opt/iedb/mhc_i/src/predict_binding.py', 'netmhccons', 'HLA-C*04:01', '8', 'analysis/pvacseq/SRR8281218/MHC_Class_I/tmp/SRR8281218.8.fa.split_801-1000']' returned non-zero exit status 1.).
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/pipeline.py", line 357, in call_iedb
    pvactools.lib.call_iedb.main(arguments)
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/call_iedb.py", line 46, in main
    raise err
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/call_iedb.py", line 41, in main
    (response_text, output_mode) = prediction_class_object.predict(args.input_file, args.allele, args.epitope_length, args.iedb_executable_path, args.iedb_retries, tmp_dir=args.tmp_dir, log_dir=args.log_dir)
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/prediction_class.py", line 79, in predict
    response = run(arguments, stdout=response_fh, check=True)
  File "/usr/local/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/local/bin/python', '/opt/iedb/mhc_i/src/predict_binding.py', 'netmhccons', 'HLA-A*03:01', '8', 'analysis/pvacseq/SRR8281218/MHC_Class_I/tmp/SRR8281218.8.fa.split_1-200']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/pvacseq", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/pvactools/tools/pvacseq/main.py", line 123, in main
    args[0].func.main(args[1])
  File "/usr/local/lib/python3.7/site-packages/pvactools/tools/pvacseq/run.py", line 142, in main
    pipeline.execute()
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/pipeline.py", line 451, in execute
    self.call_iedb(chunks)
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/pipeline.py", line 358, in call_iedb
    p.print("Making binding predictions on Allele %s and Epitope Length %s with Method %s - File %s - Completed" % (a, epl, method, filename))
  File "/usr/local/lib/python3.7/site-packages/pymp/__init__.py", line 148, in __exit__
    raise exc_t(exc_val)
TypeError: __init__() missing 1 required positional argument: 'cmd'

Please find the generated pVACseq input YAML attached here. pvacseq_inputs.txt

Please help us understand what is causing this issue.
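For readers trying to narrow this down: the traceback above shows predict_binding.py failing on sys.stdin.readline(). One condition that generally produces this kind of failure in Python on Linux is a standard input descriptor that is open but not readable. Whether that is what happens under Snakemake and Singularity is an assumption that this thread does not confirm. The sketch below uses a stand-in child script (not the IEDB code) to compare a child process started with a readable stdin against one started with a write-only stdin.

```python
import os
import subprocess
import sys

# Stand-in for the failing line in predict_binding.py (assumption: the
# real script fails for the same reason; this thread does not confirm it).
CHILD = "import sys; sys.stdin.readline()"


def run_child(stdin):
    """Run the stand-in child with the given stdin and capture stderr."""
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        stdin=stdin,
        stderr=subprocess.PIPE,
        text=True,
    )


# Readable but empty stdin: readline() returns "" and the child exits cleanly.
ok = run_child(subprocess.DEVNULL)

# Write-only stdin: the read on descriptor 0 is rejected by the OS.
with open(os.devnull, "w") as write_only:
    bad = run_child(write_only)

print("readable stdin exit code:", ok.returncode)
print("write-only stdin exit code:", bad.returncode)
print(bad.stderr.strip().splitlines()[-1:])
```

If the second case reproduces the message from the log on your system, it may be worth checking how the workflow wires up stdin for the pvacseq job, for example by redirecting it from /dev/null.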

susannasiebert commented 4 months ago

In our experience, errors like these from the IEDB software are usually intermittent, and pVACseq will be able to proceed from the failure point when you restart the run. While we aren't sure of the exact cause of the failure, we believe these errors are related to running out of compute resources. You might try running with fewer threads, using a smaller --downstream-sequence-length value (100 seems to solve problems like these for a lot of users), a smaller --fasta-size, and/or using a machine with more memory.
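As a rough illustration of these suggestions, the sketch below assembles a "pvacseq run" argument list with reduced resource settings. The input file name, output directory, thread count, and --fasta-size value are hypothetical placeholders; only the --downstream-sequence-length value of 100 comes from the advice above. Check "pvacseq run --help" in your installed version before relying on the option names.

```python
def build_pvacseq_command(
    input_vcf,
    sample_name,
    alleles,
    algorithms,
    output_dir,
    n_threads=2,                     # hypothetical: fewer threads than before
    fasta_size=100,                  # hypothetical: smaller chunks per IEDB call
    downstream_sequence_length=100,  # value suggested in this thread
):
    """Return an argument list for a resource-reduced `pvacseq run`."""
    return [
        "pvacseq", "run",
        input_vcf,
        sample_name,
        ",".join(alleles),   # alleles are passed as one comma-separated value
        *algorithms,         # prediction algorithms are space-separated
        output_dir,
        "--n-threads", str(n_threads),
        "--fasta-size", str(fasta_size),
        "--downstream-sequence-length", str(downstream_sequence_length),
    ]


cmd = build_pvacseq_command(
    input_vcf="SRR8281218.annotated.vcf",        # hypothetical file name
    sample_name="SRR8281218",
    alleles=["HLA-A*03:01", "HLA-C*04:01"],
    algorithms=["NetMHCcons"],
    output_dir="analysis/pvacseq/SRR8281218",
)
print(" ".join(cmd))
```

Rerunning with the same output directory lets pVACseq pick up from the failure point, as described above.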

ahkamzifo commented 4 months ago

@susannasiebert We tried all of the flags you mentioned and also reduced the number of threads, but to no avail: we get the same error when running the command via the Snakemake workflow. However, when run directly inside Docker, the same sample completed successfully and generated all the outputs.

susannasiebert commented 4 months ago

Hm, that's strange. Are you able to execute the following directly using singularity (assuming you still have the temporary output files for this run)?

/usr/local/bin/python /opt/iedb/mhc_i/src/predict_binding.py netmhccons HLA-C*04:01 8 analysis/pvacseq/SRR8281218/MHC_Class_I/tmp/SRR8281218.8.fa.split_801-1000

ahkamzifo commented 4 months ago

@susannasiebert Thanks for the reply. Let me try and will update you regarding the results.

ahkamzifo commented 4 months ago

While trying to run the command you mentioned I get the following error:

INFO:    Using cached SIF image
INFO:    Converting SIF file to temporary sandbox...
FATAL:   while extracting /RIMA/.singularity/cache/oci-tmp/4a441a103f3acd7d81132c8569d51b9a8c28b67aa3cdda0d0af927761e3d0540: root filesystem extraction failed: extract command failed: WARNING: passwd file doesn't exist in container, not updating
WARNING: group file doesn't exist in container, not updating
WARNING: Skipping mount /etc/hosts [binds]: /etc/hosts doesn't exist in container
WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container
WARNING: Skipping mount proc [kernel]: /proc doesn't exist in container
WARNING: Skipping mount /RIMA/miniconda3/var/singularity/mnt/session/tmp [tmp]: /tmp doesn't exist in container
WARNING: Skipping mount /RIMA/miniconda3/var/singularity/mnt/session/var/tmp [tmp]: /var/tmp doesn't exist in container
WARNING: Skipping mount /RIMA/miniconda3/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
Parallel unsquashfs: Using 48 processors
154730 inodes (254535 blocks) to write

[|                                                         ]   2592/409265   0%
FATAL ERROR: write_file: failed to create file /image/root/opt/iedb/mhc_i/method/netmhc-3.4-executable/netmhc_3_4_executable/etc/net/HLA-A31:01/sparse/synlist, because Too many open files
: exit status 1

The command I executed is below:

singularity exec docker://griffithlab/pvactools:4.1.1 /usr/local/bin/python /opt/iedb/mhc_i/src/predict_binding.py netmhccons HLA-C*04:01 8 analysis/pvacseq/SRR8281218/MHC_Class_I/tmp/SRR8281218.8.fa.split_801-1000

I am not sure whether the error is with Singularity or with the IEDB method. Do let me know if you have any suggestions to try. Thanks.
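The "Too many open files" message in the output above comes from the image extraction step, before predict_binding.py starts, which suggests a per-process open-file limit was reached. A minimal sketch for inspecting and, where permitted, raising that limit from Python on a POSIX system follows. The helper name raise_open_file_limit and the target of 4096 are hypothetical, and a raised limit only applies to this process and to children started from it.

```python
import resource


def raise_open_file_limit(desired):
    """Raise the soft open-file limit toward `desired`, capped at the hard limit.

    Returns the (soft, hard) limits in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never ask for more than the hard limit unless it is unlimited.
    target = desired if hard == resource.RLIM_INFINITY else min(desired, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


before = resource.getrlimit(resource.RLIMIT_NOFILE)
after = raise_open_file_limit(4096)  # hypothetical target value
print("open-file limit before:", before)
print("open-file limit after: ", after)
```

A singularity command launched from this process (for example with subprocess.run) would inherit the adjusted limit. Whether this resolves the extraction failure is not established in this thread.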

susannasiebert commented 4 months ago

I'm really not sure as I have no familiarity with singularity. Since things seem to work fine for you under the base docker image, I assume that it is related to singularity. You might be able to get more help from IEDB directly (help.iedb.org).

susannasiebert commented 1 month ago

Closing this issue due to inactivity.