griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

Temporary directory for pvacseq #1026

Closed ysbioinfo closed 9 months ago

ysbioinfo commented 11 months ago

Installation Type

Standalone

pVACtools Version / Docker Image

4.0.1

Python Version

3.8.3

Operating System

Centos7

Describe the bug

Hi Susanna, I got an error after the MHCI and MHCII epitopes were predicted; it says there is no space left on my disk. However, I can confirm that there is enough space in the output_dir. So I wonder whether pvacseq uses a temp_dir: if its default path is in my home directory (only 100 MB left), that would explain everything. If so, besides cleaning up my home directory, is there another solution, such as manually assigning a temp_dir myself? Thanks!

Yang

Below is my error output:

Parsing prediction file for Allele DRB1*07:01 and Epitope Length 24 - Entries 20801-20820
Parsing prediction file for Allele DRB1*07:01 and Epitope Length 24 - Entries 20801-20820 - Completed
Parsing binding predictions for Allele DRB1*07:01 and Epitope Length 25 - Entries 20801-20820
Parsing prediction file for Allele DRB1*07:01 and Epitope Length 25 - Entries 20801-20820
Parsing prediction file for Allele DRB1*07:01 and Epitope Length 25 - Entries 20801-20820 - Completed
Combining Parsed Prediction Files
Completed
Creating aggregated report
Tumor clonal VAF estimated as 0.5 (estimated from Tumor DNA VAF data). Assuming variants with VAF < 0.25 are subclonal
Completed
Calculating Manufacturability Metrics
Completed
Running Binding Filters
Traceback (most recent call last):
  File "/mnt/efs/NGS/yangshi/software/anaconda3/envs/pvactools/bin/pvacseq", line 8, in <module>
    sys.exit(main())
  File "/mnt/efs/NGS/yangshi/software/anaconda3/envs/pvactools/lib/python3.8/site-packages/pvactools/tools/pvacseq/main.py", line 123, in main
    args[0].func.main(args[1])
  File "/mnt/efs/NGS/yangshi/software/anaconda3/envs/pvactools/lib/python3.8/site-packages/pvactools/tools/pvacseq/run.py", line 165, in main
    pipeline.execute()
  File "/mnt/efs/NGS/yangshi/software/anaconda3/envs/pvactools/lib/python3.8/site-packages/pvactools/lib/pipeline.py", line 484, in execute
    PostProcessor(**post_processing_params).execute()
  File "/mnt/efs/NGS/yangshi/software/anaconda3/envs/pvactools/lib/python3.8/site-packages/pvactools/lib/post_processor.py", line 59, in execute
    self.execute_binding_filter()
  File "/mnt/efs/NGS/yangshi/software/anaconda3/envs/pvactools/lib/python3.8/site-packages/pvactools/lib/post_processor.py", line 136, in execute_binding_filter
    BindingFilter(
  File "/mnt/efs/NGS/yangshi/software/anaconda3/envs/pvactools/lib/python3.8/site-packages/pvactools/lib/binding_filter.py", line 52, in execute
    Filter(self.input_file, self.output_file, filter_criteria).execute()
  File "/mnt/efs/NGS/yangshi/software/anaconda3/envs/pvactools/lib/python3.8/site-packages/pvactools/lib/filter.py", line 34, in execute
    writer.writerow(line)
  File "/mnt/efs/NGS/yangshi/software/anaconda3/envs/pvactools/lib/python3.8/csv.py", line 154, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
OSError: [Errno 28] No space left on device

How to reproduce this bug

See above

Input files

No response

Log output

See above

Output files

No response

susannasiebert commented 11 months ago

There are definitely modules in pVACtools that use temporary files, usually using the tempfile module. The tempfile module determines the tmp file location as follows:

Python searches a standard list of directories to find one which the calling user can create files in. The list is:

  • The directory named by the TMPDIR environment variable.
  • The directory named by the TEMP environment variable.
  • The directory named by the TMP environment variable.
  • A platform-specific location:
    • On Windows, the directories C:\TEMP, C:\TMP, \TEMP, and \TMP, in that order.
    • On all other platforms, the directories /tmp, /var/tmp, and /usr/tmp, in that order.
  • As a last resort, the current working directory.
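As a rough illustration of the search order listed above, the following sketch rebuilds the candidate list for non-Windows platforms. The helper name `candidate_tempdirs` is hypothetical and is not part of pVACtools or the tempfile module; the real module additionally tests each candidate by trying to create a file in it.

```python
import os


def candidate_tempdirs(environ=None):
    """Hypothetical helper: list the directories tempfile would try, in order.

    Covers the documented non-Windows order only; it does not test writability.
    """
    env = os.environ if environ is None else environ
    # Environment variables come first; unset or empty values are skipped.
    candidates = [env[name] for name in ("TMPDIR", "TEMP", "TMP") if env.get(name)]
    # Platform-specific fallbacks for non-Windows systems.
    candidates += ["/tmp", "/var/tmp", "/usr/tmp"]
    # Last resort: the current working directory.
    candidates.append(os.getcwd())
    return candidates


if __name__ == "__main__":
    for path in candidate_tempdirs():
        print(path, "(exists)" if os.path.isdir(path) else "(missing)")
```

Running it in the same shell that launches pvacseq shows which candidates exist, which can hint at where temporary files are likely to end up.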

You can check what the current tmp directory is by running the following two commands from inside a python session:

>>> import tempfile
>>> tempfile.gettempdir()
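Since the error here is about disk space, it may also help to check how much room is left on the filesystem that holds the tmp directory. A minimal sketch using only the standard library (the function name `tempdir_free_space` is made up for this example):

```python
import shutil
import tempfile


def tempdir_free_space():
    """Return the temp directory tempfile has chosen and the free bytes on its filesystem."""
    path = tempfile.gettempdir()
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return path, usage.free


path, free_bytes = tempdir_free_space()
print(f"{path}: {free_bytes / 1024**2:.0f} MiB free")
```

If the reported free space is small compared with the size of the expected output files, pointing TMPDIR at a larger filesystem is probably worth trying.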

You can override the default tmp file location by setting the TMPDIR environment variable before running your pvacseq command, e.g. like so:

TMPDIR='/my/desired/tmp/dir' pvacseq run ...

Please be aware that the directory you want to use must already exist; otherwise the value will be ignored and the default tmp dir location will be used.
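The fallback for a non-existent directory can be observed directly. The sketch below is only an illustration, with a hypothetical helper name `resolve_tempdir`: it temporarily sets TMPDIR, clears the choice cached in `tempfile.tempdir`, and asks the tempfile module which directory it would use.

```python
import os
import tempfile


def resolve_tempdir(tmpdir_value):
    """Hypothetical helper: return the directory tempfile picks when TMPDIR=tmpdir_value."""
    saved_env = os.environ.get("TMPDIR")
    saved_cache = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = tmpdir_value
        tempfile.tempdir = None  # clear the cached choice so it is recomputed
        return tempfile.gettempdir()
    finally:
        # Restore the original environment and cached value.
        if saved_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = saved_env
        tempfile.tempdir = saved_cache


# Assumed not to exist; the name is arbitrary and chosen for the demo.
missing = os.path.join(os.getcwd(), "no-such-tmp-dir-for-demo")
print(resolve_tempdir(missing))  # prints a fallback directory, not the missing path
```

Creating the directory first (for example with `mkdir -p`) before launching pvacseq avoids this silent fallback.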

N-McC commented 11 months ago

Hello, I seem to be getting the same error. The only difference is that I am using the Docker image. When executing pvacseq run I run into the error (example and error output below). When launching the .sif file I can see that $TMPDIR has been changed to the manually assigned one, but I still run into the same error with no space left on the device.
Do you have any suggestions for a fix? Thanks for the help. Neil

Example Input code:

export TMPDIR=/my/tmp/dir/
TMPDIR='/my/tmp/dir/'

apptainer exec -C \
-B /my/pvac/outdir:/pvac.dir \
-B /my/tmp/dir/ \
pvactools_latest.sif pvacseq run \
/pvac.dir/my.vcf sample_id \
HLA-A*02:01,HLA-A*11:01,HLA-B*54:01,HLA-B*54:01,HLA-C*01:02,HLA-C*01:02 \
MHCflurry MHCnuggetsI NetMHC PickPocket SMM SMMPMBEC NetMHCcons /pvac.dir -e1 9 -t 24 --iedb-install-directory /opt/iedb 

Errors are as follows:

Calculating Manufacturability Metrics
Traceback (most recent call last):
  File "/usr/local/bin/pvacseq", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/pvactools/tools/pvacseq/main.py", line 116, in main
    args[0].func.main(args[1])
  File "/usr/local/lib/python3.7/site-packages/pvactools/tools/pvacseq/run.py", line 133, in main
    pipeline.execute()
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/pipeline.py", line 466, in execute
    PostProcessor(**post_processing_params).execute()
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/post_processor.py", line 32, in execute
    self.calculate_manufacturability()
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/post_processor.py", line 55, in calculate_manufacturability
    CalculateManufacturability(self.input_file, self.manufacturability_fh.name, self.file_type).execute()
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/calculate_manufacturability.py", line 63, in execute
    writer.writerow(line)
  File "/usr/local/lib/python3.7/csv.py", line 155, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
OSError: [Errno 28] No space left on device

susannasiebert commented 11 months ago

@N-McC Are you sure that /my/tmp/dir/ is being mounted in the correct spot? What does the following return:

export TMPDIR=/my/tmp/dir/
TMPDIR='/my/tmp/dir/'

apptainer exec -C \
-B /my/pvac/outdir:/pvac.dir \
-B /my/tmp/dir/ \
pvactools_latest.sif ls $TMPDIR

Edit to add: using Docker on my end, when I mount /my/tmp/dir with -v /my/tmp/dir, the folder/path exists inside the Docker container but I can't see any of the existing files. When I mount it with -v /my/tmp/dir:/my/tmp/dir I can see the existing files, so try changing the mount parameter to -B /my/tmp/dir:/my/tmp/dir

N-McC commented 11 months ago

Hi, thanks for your advice. I double-checked that the tmp folders were being mounted, and it turns out that adding -C was overriding my desired $TMPDIR. I added a test.txt file to the desired tmp folder, and the results are below. I have successfully completed one of my problematic samples, so it is now working for me.

Adding -B /my/tmp/dir:/my/tmp/dir also works in these cases, but it is not necessary, as they also run fine without it. This could just be a nice quirk of apptainer.

These three work perfectly. I would suggest '/bin/bash -c' for future troubleshooting of a .sif, as the tmp directory can be tested alongside our other scripts and it won't overwrite our mounted directory.

# correctly gives $TMPDIR: 
apptainer exec -B /my/tmp/dir/ pvactools_4.0.4.sif /bin/bash -c  'echo $TMPDIR; ls $TMPDIR'

test.txt

# correctly gives $TMPDIR: 
apptainer exec -B /my/tmp/dir/ pvactools_4.0.4.sif echo $TMPDIR

/my/tmp/dir/

# correctly lists $TMPDIR: 
apptainer exec -B /my/tmp/dir/ pvactools_4.0.4.sif ls $TMPDIR

test.txt

These options do not mount the directory properly. It also seems that any use of -C causes issues here.

# adding -C overrides $TMPDIR: 
apptainer exec -C -B /my/tmp/dir/ pvactools_4.0.4.sif ls $TMPDIR

No such file or directory

# adding -c, except in the form "/bin/bash -c", overrides $TMPDIR:
apptainer exec -c -B /my/tmp/dir/ pvactools_4.0.4.sif ls $TMPDIR

No such file or directory

susannasiebert commented 9 months ago

It looks like this issue has been resolved. Closing this ticket.