icbi-lab / nextNEOpi

nextNEOpi: a comprehensive pipeline for computational neoantigen prediction

error message when run a test case #52

Closed ryao-mdanderson closed 10 months ago

ryao-mdanderson commented 1 year ago

Dear NextNEOpi authors:

I followed the README, have Nextflow version 22.10.8 installed, set up the references in a custom location, and modified resourcesBaseDir in conf/params.config.

I tried the following command to run a test case:

$ nextflow run nextNEOpi.nf --batchFile testdata_batchFile_FASTQ.csv --CNVkit false -profile singularity -config conf/params.config --accept_license --TCR false

At the end of the output, I see:

Execution cancelled -- Finishing pending tasks before exit
[icbi/nextNEOpi] Pipeline Complete! You can find your results in /rsrch3/home/itops/ryao/nextNEOpi/results
Error executing process > 'install_IEDB (Install IEDB)'

Caused by: Process install_IEDB (Install IEDB) terminated with an error exit status (3)

Command executed:

export TMPDIR=/tmp/ryao/nextNEOpi/

CWD=`pwd`
cd /opt/iedb/
rm -f IEDB_MHC_I-3.1.4.tar.gz
wget https://downloads.iedb.org/tools/mhci/3.1.4/IEDB_MHC_I-3.1.4.tar.gz
tar -xzvf IEDB_MHC_I-3.1.4.tar.gz
cd mhc_i
bash -c "./configure"
cd /opt/iedb/
rm -f IEDB_MHC_I-3.1.4.tar.gz

rm -f IEDB_MHC_II-3.1.8.tar.gz
wget https://downloads.iedb.org/tools/mhcii/3.1.8/IEDB_MHC_II-3.1.8.tar.gz
tar -xzvf IEDB_MHC_II-3.1.8.tar.gz
# ATTENTION: IEDB_MHC_II-3.1.8.tar.gz "python configure.py"
# returns an assertion error in the unittest needs
# to be fixed, skip unittests for now
cd mhc_ii
bash -c "python ./configure.py"
cd /opt/iedb/
rm IEDB_MHC_II-3.1.8.tar.gz

export MHCFLURRY_DATA_DIR=/opt/mhcflurry_data
mhcflurry-downloads fetch

cd $CWD
echo "OK" > .iedb_install_ok.chck

Command exit status: 3

Command output: (empty)

Command error:
--2023-09-28 17:01:10-- https://downloads.iedb.org/tools/mhci/3.1.4/IEDB_MHC_I-3.1.4.tar.gz
Resolving downloads.iedb.org (downloads.iedb.org)... 8.37.117.143
Connecting to downloads.iedb.org (downloads.iedb.org)|8.37.117.143|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 345348484 (329M) [application/x-gzip]
IEDB_MHC_I-3.1.4.tar.gz: Permission denied

Cannot write to ‘IEDB_MHC_I-3.1.4.tar.gz’ (Permission denied).

Work dir: /rsrch3/home/itops/ryao/nextNEOpi/work/6b/e0e98264330d8bc14d89ee7c13443f

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

--------- end of the command output

Could you please help me understand this?

Thank you very much, Rong Yao

riederd commented 1 year ago

Hi, it seems that you do not have permission to write to the resourcesBaseDir you specified. See:

Cannot write to ‘IEDB_MHC_I-3.1.4.tar.gz’ (Permission denied).

Can you show the output of the following commands:

id
ls -la your_resourcesBaseDir

(Please substitute your_resourcesBaseDir with the real path of the directory you configured.)
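A quick way to confirm this kind of problem before starting a run is to test the directory from the shell as the user who will launch the pipeline. The following is only a minimal sketch: `check_writable` is a hypothetical helper name, not part of nextNEOpi, and it relies only on the standard `test` and `id` utilities.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nextNEOpi): report whether the current
# user can write to a directory such as the configured resourcesBaseDir.
check_writable() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi
    if [ -w "$dir" ]; then
        echo "writable: $dir"
        return 0
    fi
    # Show which user the check ran as, since that is what matters here.
    echo "not writable: $dir (running as $(id -un))"
    return 1
}

# Example call (substitute the real path):
#   check_writable your_resourcesBaseDir
```

A non-zero return code here would point at the same "Permission denied" that wget reported above.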

ryao-mdanderson commented 1 year ago

Hello @riederd

You are right: my user ID (ryao) does not have write permission. The resourcesBaseDir configured in conf/params.config is resourcesBaseDir = "/rsrch3/scratch/reflib/REFLIB_data/nextneopi-1.4.0/"

ls -la /rsrch3/scratch/reflib/REFLIB_data/nextneopi-1.4.0/
total 42316629
drwxr-xr-x 6 root root 4096 Sep 28 17:18 .
drwxr-xr-x 70 root root 8192 Sep 29 09:35 ..
drwxr-xr-x 7 40005 50000 4096 Sep 29 09:16 databases
drwxr-xr-x 4 40005 50000 4096 May 3 2021 ExomeCaptureKits
-rw-r--r-- 1 root root 345348484 Jan 2 2023 IEDB_MHC_I-3.1.4.tar.gz
-rw-r--r-- 1 ryao rists 42662861027 Mar 22 2022 nextNEOpi_1.4_resources.tar.gz
-rw-r--r-- 1 ryao rists 323911161 Jul 27 09:24 nextNEOpi_testdata.tar.gz
drwxr-xr-x 7 40005 50000 4096 Mar 1 2022 references
drwxr-xr-x 2 40005 50000 4096 Jul 26 10:17 testdata

I am the system admin and have root privileges. I would like to store the resources in a shared location so that our cluster users can also use them. If I change the directory ownership to root:root and give group and others read permission, is that okay?

The README does not mention that IEDB needs to be installed ahead of time; I only found out when I ran the test. I manually downloaded IEDB_MHC_I-3.1.4.tar.gz, untarred it, and configured IEDB in place: cd mhc_i, then bash -c ./configure

Is this the right procedure?

[root@ldragon2 mhc_i]# ls -l /rsrch3/scratch/reflib/REFLIB_data/nextneopi-1.4.0/databases/iedb/mhc_i
total 389
-rwxrwxr-x 1 p_matlab p_matlab 19 Jan 2 2023 configure
-rw-rw-r-- 1 p_matlab p_matlab 11015 Jan 2 2023 Copenhagen_license.txt
drwxrwxr-x 3 p_matlab p_matlab 4096 Jan 2 2023 data
drwxrwxr-x 2 p_matlab p_matlab 4096 Jan 2 2023 examples
-rw-rw-r-- 1 p_matlab p_matlab 11839 Jan 2 2023 LIAI_license.txt
drwxrwxr-x 16 p_matlab p_matlab 4096 Jan 2 2023 method
-rw-rw-r-- 1 p_matlab p_matlab 134656 Jan 2 2023 mhc_list.xls
-rw-rw-r-- 1 p_matlab p_matlab 4795 Jan 2 2023 README
drwxrwxr-x 2 p_matlab p_matlab 4096 Sep 29 09:28 src

Now I wonder about the p_matlab ownership.

Thank you for your help.

ryao-mdanderson commented 1 year ago

I forgot to mention: mhc_i/src/configure.py limits the installation path to 57 characters. To get it to configure, I changed the limit to 100, and it configured. Is this okay? Is there a reason for 57?

ryao-mdanderson commented 1 year ago

Hi @riederd

Update on today's work:

  1. I manually downloaded IEDB_MHC_I-3.1.4.tar.gz, changed configure.py to allow a directory path length of up to 100, then installed databases/mhc_i successfully.

  2. Changed the ownership of /rsrch3/scratch/reflib/REFLIB_data/nextneopi-1.4.0/ to a service account, with read permission for group and others on the directory tree.

  3. Because IEDB has been installed manually, I created a non-empty flag file databases/iedb/.iedb_install_ok.chck to keep the IEDB install from being triggered during the test run.

  4. Changed Tmp_dir in conf/params.config to make sure the tmp directory has > 50GB.
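For anyone repeating step 3, the flag file can be verified from the shell before launching a run. This is a minimal sketch under assumptions: `iedb_flag_ok` is a hypothetical helper, and it only checks that the file exists and is non-empty, as described in the step above. How the pipeline itself evaluates the flag is not shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the IEDB flag file exists in the
# given directory and is non-empty (test -s covers both conditions).
iedb_flag_ok() {
    local iedb_dir="$1"
    [ -s "$iedb_dir/.iedb_install_ok.chck" ]
}

# Example call, with the path relative to resourcesBaseDir as in step 3:
#   iedb_flag_ok databases/iedb && echo "flag present"
```

The install command logged earlier writes `echo "OK" > .iedb_install_ok.chck`, so creating the file the same way after a manual install mirrors what the pipeline would have written.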

Then I ran the test case again. Unfortunately, after a 30-minute run, it failed while pulling a Singularity image:

Error executing process > 'installVEPcache (installVEPcache)'
Caused by: Failed to pull singularity image
command: singularity pull --name depot.galaxyproject.org-singularity-ensembl-vep-110.0--pl5321h2a3209d_0.img.pulling.1696016364914 https://depot.galaxyproject.org/singularity/ensembl-vep:110.0--pl5321h2a3209d_0 > /dev/null
status : 143

At this point, the test run is not successful.

Thank you for your help.

riederd commented 1 year ago

Hi,

Can you post the contents (.tar.gz) of the work dir of the failed process?

Did you retry? What happens when you run the process manually? e.g.:

cd <workdir of the failed process>
bash .command.run

riederd commented 12 months ago

Any feedback on this?

ryao-mdanderson commented 12 months ago

Hi @riederd Dietmar,

I am very sorry for the late reply. One reason is that singularity pull of ensembl-vep kept failing, not only during the nextflow run but also when pulling manually. Until this error is resolved, I am stuck on the nextflow test run.

[ryao@myserver ~]$ singularity pull depot.galaxyproject.org-singularity-ensembl-vep-110.0--pl5321h2a3209d_0.img.pulling.1696016364914 https://depot.galaxyproject.org/singularity/ensembl-vep:110.0--pl5321h2a3209d_0

INFO: Downloading network image
629.7MiB / 991.6MiB [=================================================>----------------------------] 64 % 358.3 KiB/s 17m14s
FATAL: net/http: request canceled (Client.Timeout exceeded while reading body)

The above is an example of my singularity pull. By the way, are you able to pull the image on your server?

As for the IEDB installation: I believe your application handles this automatically, but since I was able to install IEDB manually, I wanted to get the nextflow test working first. Therefore I did not retry, partly because the reference directory sits in a shared directory on a GPFS file system with access control; simply granting write access to a non-root user will not work.

Thank you for your help. Rong Yao

riederd commented 12 months ago

Hi, it seems that you are having some sort of (temporary?) network issues. I just tried to pull the image and it went smoothly; it took 2 minutes.

You might try to set a longer timeout for singularity image pulls, e.g. at line: https://github.com/icbi-lab/nextNEOpi/blob/fe7b21cdc0b97aae38195e5f5ac1b9851674f6b1/conf/profiles.config#L91 add the following:

singularity.pullTimeout = 3600
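
Independently of the timeout setting, a flaky download can also be wrapped in a simple retry loop when pulling images by hand. The following is a minimal sketch, not something nextNEOpi provides: `retry` is a hypothetical helper, and the output file name in the usage comment is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to <max> times, sleeping <delay>
# seconds between failed attempts. Returns 0 on the first success and
# 1 once all attempts have failed.
retry() {
    local max="$1" delay="$2"
    shift 2
    local n=1
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$max" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        echo "attempt $n failed, retrying in ${delay}s" >&2
        sleep "$delay"
        n=$((n + 1))
    done
}

# Example call (ensembl-vep.img is an arbitrary local file name):
#   retry 5 60 singularity pull --name ensembl-vep.img \
#       https://depot.galaxyproject.org/singularity/ensembl-vep:110.0--pl5321h2a3209d_0
```

This only helps with transient failures; each attempt starts the download again, so it does not compensate for a link that is consistently too slow to finish within the client timeout.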

ryao-mdanderson commented 12 months ago

Hi @riederd,

I followed your suggestion to add singularity.pullTimeout = 3600 in conf/profiles.config and reran the nextNEOpi test data. This time it exited quickly while pulling the fastqc image. I have attached a file with the full output to this message.

I also tried to manually singularity pull ensembl-vep-110.0 again; after 17 minutes, it exited with the same error I reported.

Could your 2-minute pull be due to physical location? I guess depot.galaxyproject.org is hosted in Europe, and I work in Texas, USA.

It is frustrating trying to get this to work. nextNEOpi-test.txt

Thank you, Rong Yao

riederd commented 11 months ago

I'm sorry that you are running into these issues, which apparently are caused by something not specific to nextNEOpi, since even pulling the image manually fails. Unfortunately I cannot reproduce them, so it is difficult for me to pinpoint the root cause.

Are there any limits set on the system you are using, e.g. memory, cputime?

ryao-mdanderson commented 11 months ago

@riederd you have been very helpful already, thank you.

Today I finally managed to pull the ensembl-vep image manually on an HPC cluster node. It took ~30 minutes, a significant difference compared to your 2-minute pull.

I am waiting for the firewall to be opened for https://apps-01.i-med.ac.at-images on the cluster node. I will test nextflow again when this is ready.

Cheers, Rong

riederd commented 10 months ago

I'm closing this now