icbi-lab / nextNEOpi

nextNEOpi: a comprehensive pipeline for computational neoantigen prediction

Error running NeoFuse - cannot find models_class1_pan/models.combined/manifest.csv #25

Closed · mantczakaus closed 1 year ago

mantczakaus commented 1 year ago

Hi, thank you for this amazing pipeline! I am currently running it on WES and RNA-seq data and I'm having trouble running NeoFuse. The content of command.log is as follows:

INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
INFO:    fuse: warning: library too old, some operations may not work
[-------------------------------- [NeoFuse] --------------------------------]

[NeoFuse]  Paired End (PE) Reads detected: commencing processing
[NeoFuse]  Processing files TESLA_3_1.fastq.gz - TESLA_3_2.fastq.gz
[NeoFuse]  STAR Run started at: 16:13:03
[NeoFuse]  Arriba Run started at: 16:13:03
[NeoFuse]  Parsing custom HLA list: 18:08:02
[NeoFuse]  featureCounts Run started at: 18:08:02
[NeoFuse]  Converting Raw Counts to TPM and FPKM: 18:09:38
[NeoFuse]  Searching for MHC I peptides of length 8 9 10 11 : 18:09:39
[NeoFuse]  Searching for MHC II peptides of length 15 16 17 18 19 20 21 22 23 24 25 : 18:09:39
[NeoFuse]  MHCFlurry Run started at: 18:09:39
An error occured while creating the MHCFlurry temp files, check ./patient1/LOGS/patient1_MHCI_final.log for more details

The content of ./patient1/LOGS/patient1_MHCI_final.log:

Traceback (most recent call last):
  File "/usr/local/bin/source/build_temp.py", line 122, in <module>
    final_out(inFile, outFile)
  File "/usr/local/bin/source/build_temp.py", line 61, in final_out
    with open(assoc_file) as csv_file:
FileNotFoundError: [Errno 2] No such file or directory: './patient1/NeoFuse/tmp/MHC_I/patient1_8_NUP133_ABCB10_1_8.tsv'

I also checked the contents of the patient1_X_MHCFlurry.log files. They all say:

Traceback (most recent call last):
  File "/usr/local/bin//mhcflurry-predict", line 8, in <module>
    sys.exit(run())
  File "/usr/local/lib/python3.6/dist-packages/mhcflurry/predict_command.py", line 207, in run
    affinity_predictor = Class1AffinityPredictor.load(models_dir)
  File "/usr/local/lib/python3.6/dist-packages/mhcflurry/class1_affinity_predictor.py", line 480, in load
    manifest_df = pandas.read_csv(manifest_path, nrows=max_models)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py", line 688, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py", line 454, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py", line 948, in __init__
    self._make_engine(self.engine)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py", line 1180, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py", line 2010, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/home/neofuse/.local/share/mhcflurry/4/2.0.0/models_class1_pan/models.combined/manifest.csv'

To debug this, I downloaded the current NeoFuse container from https://github.com/icbi-lab/NeoFuse, but the run with it resulted in the same errors. Is it possible that the default MHCflurry models changed? I'm not that familiar with MHCflurry or NeoFuse - maybe you could point me in the right direction?

abyssum commented 1 year ago

Hello @mantczakaus,

Thank you for using nextNEOpi.

This is weird behavior... can you pull the image locally with something like: wget --no-check-certificate https://apps-01.i-med.ac.at/images/singularity/NeoFuse_dev_0d1d4169.sif

then run: singularity exec NeoFuse_dev_0d1d4169.sif ls /home/neofuse/.local/share/mhcflurry/4/2.0.0/models_class1_pan/models.combined/

and paste the results?

riederd commented 1 year ago

Moreover, can you also send the contents of the work dir in which the pipeline failed.

mantczakaus commented 1 year ago

Thanks @abyssum for such a prompt response! I ran the commands you asked for as an interactive job on my HPC. The result is similar - it cannot see the folder:

/bin/ls: cannot access '/home/neofuse/.local/share/mhcflurry/4/2.0.0/models_class1_pan/models.combined/': No such file or directory

I also tried singularity exec NeoFuse_dev_0d1d4169.sif ls /home/neofuse and got the same thing: /bin/ls: cannot access '/home/neofuse': No such file or directory. Could there be some extra singularity options that I would need to run it with? All the other containers that nextNEOpi used up to NeoFuse worked fine, though.
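From what I understand of the Singularity defaults (an assumption on my side, based on the docs rather than anything verified here): the host's $HOME, plus any site-wide bind paths from singularity.conf, get mounted over the container's /home at runtime, which would hide files baked into the image such as /home/neofuse. A quick comparison that should show this, using the same image:

# Default run: the host $HOME (and any site-configured binds) are
# mounted over /home, so the image's own /home/neofuse may be hidden.
singularity exec NeoFuse_dev_0d1d4169.sif ls /home

# --contain skips those default binds, so the image's own /home
# should become visible again.
singularity exec --contain NeoFuse_dev_0d1d4169.sif ls /home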

mantczakaus commented 1 year ago

Moreover, can you also send the contents of the work dir in which the pipeline failed.

Thank you @riederd for coming back to me! Here are all the run and log files from that work folder. work_NeoFuse.zip

riederd commented 1 year ago

Can you run the following commands and send the output?

singularity exec -B /QRISdata/Q5952/data/tesla-phase1/melanoma_1/FASTQ -B /scratch/project_mnt/S0091/mantczak --no-home -H /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/pipelines/nextNEOpi/assets -B /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources -B /scratch/project_mnt/S0091/mantczak/soft/hlahd.1.7.0 -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/iedb:/opt/iedb -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/mhcflurry_data:/opt/mhcflurry_data /scratch/project_mnt/S0091/mantczak/tests/nextneopi_validation/work/singularity/apps-01.i-med.ac.at-images-singularity-NeoFuse_dev_0d1d4169.sif /bin/bash  -c  "ls -la /home/neofuse"

and

singularity exec -B /QRISdata/Q5952/data/tesla-phase1/melanoma_1/FASTQ -B /scratch/project_mnt/S0091/mantczak --no-home -H /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/pipelines/nextNEOpi/assets -B /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources -B /scratch/project_mnt/S0091/mantczak/soft/hlahd.1.7.0 -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/iedb:/opt/iedb -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/mhcflurry_data:/opt/mhcflurry_data /scratch/project_mnt/S0091/mantczak/tests/nextneopi_validation/work/singularity/apps-01.i-med.ac.at-images-singularity-NeoFuse_dev_0d1d4169.sif /bin/bash  -c  "mount"

Thanks

mantczakaus commented 1 year ago

Can you run the following commands and send the output?

[…]

Hi @riederd

The first command gave the following output:

ls: cannot access '/home/neofuse': No such file or directory

The output of the second command is attached: mount.txt

Thanks!

riederd commented 1 year ago

Thanks,

can you try again but with the option --containall added after --no-home
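
For background (my reading of the Singularity docs, so please double-check how your site is configured): --no-home only skips binding the host $HOME, while --containall additionally uses a minimal /dev, contains PID, IPC and the environment, and skips the remaining default host binds, including any site-wide bind paths from singularity.conf that can shadow the image's /home. You can see the difference directly by comparing the mounts:

# With --no-home alone, site-configured binds may still sit over /home:
singularity exec --no-home NeoFuse_dev_0d1d4169.sif sh -c "mount | grep ' /home'"

# With --containall, those default binds are skipped:
singularity exec --containall NeoFuse_dev_0d1d4169.sif sh -c "mount | grep ' /home'"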

mantczakaus commented 1 year ago

Thanks,

can you try again but with the option --containall added after --no-home

For the first command: contain_ls.txt. For the second command: contain_mount.txt. I also ran the following command in the folder with the container downloaded by the pipeline (work/singularity):

singularity exec --containall apps-01.i-med.ac.at-images-singularity-NeoFuse_dev_0d1d4169.sif ls -la /home/neofuse/.local/share/mhcflurry/4/2.0.0/models_class1_pan/models.combined/

It gave me the following list of files:

-rw-r--r-- 1 uqmantcz qris-uq   760514 Jun 10  2020 allele_sequences.csv
-rw-r--r-- 1 uqmantcz qris-uq 59372077 Jun 10  2020 frequency_matrices.csv.bz2
-rw-r--r-- 1 uqmantcz qris-uq      102 Jun 10  2020 info.txt
-rw-r--r-- 1 uqmantcz qris-uq  1012279 Jun 10  2020 length_distributions.csv.bz2
-rw-r--r-- 1 uqmantcz qris-uq   115260 Jun 10  2020 manifest.csv
-rw-r--r-- 1 uqmantcz qris-uq  4483361 Jun 10  2020 model_selection_data.csv.bz2
-rw-r--r-- 1 uqmantcz qris-uq   215596 Jun 10  2020 model_selection_summary.csv.bz2
-rw-r--r-- 1 uqmantcz qris-uq 83090609 Jun 10  2020 percent_ranks.csv
-rw-r--r-- 1 uqmantcz qris-uq  4488832 Jun 10  2020 train_data.csv.bz2
-rw-r--r-- 1 uqmantcz qris-uq 11261512 Jun 10  2020 weights_PAN-CLASS1-1-05734e73adff1f25.npz
-rw-r--r-- 1 uqmantcz qris-uq 11261512 Jun 10  2020 weights_PAN-CLASS1-1-0c7c1570118fd907.npz
-rw-r--r-- 1 uqmantcz qris-uq  9160264 Jun 10  2020 weights_PAN-CLASS1-1-24d9082b2c8d7a60.npz
-rw-r--r-- 1 uqmantcz qris-uq  4582984 Jun 10  2020 weights_PAN-CLASS1-1-3ed9fb2d2dcc9803.npz
-rw-r--r-- 1 uqmantcz qris-uq  9160264 Jun 10  2020 weights_PAN-CLASS1-1-8475f7a9fb788e27.npz
-rw-r--r-- 1 uqmantcz qris-uq  5821000 Jun 10  2020 weights_PAN-CLASS1-1-9e049de50b72dc23.npz
-rw-r--r-- 1 uqmantcz qris-uq  7396364 Jun 10  2020 weights_PAN-CLASS1-1-9f7dfdd0c2763c42.npz
-rw-r--r-- 1 uqmantcz qris-uq  4845580 Jun 10  2020 weights_PAN-CLASS1-1-b17c8628ffc4b80d.npz
-rw-r--r-- 1 uqmantcz qris-uq  9160264 Jun 10  2020 weights_PAN-CLASS1-1-ce288787fc2f6872.npz
-rw-r--r-- 1 uqmantcz qris-uq  7396364 Jun 10  2020 weights_PAN-CLASS1-1-e33438f875ba4af2.npz

Thank you!

riederd commented 1 year ago

Great, so I'd suggest changing https://github.com/icbi-lab/nextNEOpi/blob/aac260dbd5da701d22a09846045f3f42981cd4a1/conf/params.config#L274 to:

runOptions =  "--no-home --containall" + " -H " + params.singularityTmpMount + " -B " +  params.singularityAssetsMount + " -B " + params.singularityTmpMount + " -B " + params.resourcesBaseDir + params.singularityHLAHDmount + " -B " + params.databases.IEDB_dir + ":/opt/iedb" + " -B " + params.databases.MHCFLURRY_dir + ":/opt/mhcflurry_data"

I'm not sure if you would hit an issue elsewhere with this change, but it is worth trying. Let us know if it works; we might then change it in the next version.
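
If you want to verify the change before a full pipeline re-run, a minimal check against the image Nextflow already cached (same paths as in the commands above) would be something like:

# Sanity check: with the new flags the MHCflurry manifest inside the
# image should be visible again.
cd /scratch/project_mnt/S0091/mantczak/tests/nextneopi_validation/work/singularity
singularity exec --no-home --containall \
    apps-01.i-med.ac.at-images-singularity-NeoFuse_dev_0d1d4169.sif \
    ls /home/neofuse/.local/share/mhcflurry/4/2.0.0/models_class1_pan/models.combined/manifest.csv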

mantczakaus commented 1 year ago

Thank you! I've just launched the pipeline with the changed config file. I'll let you know how it goes.

mantczakaus commented 1 year ago

It worked - thank you!