ncbi / pgap

NCBI Prokaryotic Genome Annotation Pipeline

Installation! #128

Closed as7a5 closed 3 years ago

as7a5 commented 3 years ago

At the final stage of the quick start installation on a Red Hat 7.4 cluster, using Singularity 3.1 and Python, I get the following error:

[siavoa01@bigpurple-ln1 pgpu372-sing31]$ ./pgap.py -r -o mg37_results test_genomes/MG37/input.yaml
PGAP version 2021-01-11.build5132 is up to date.
Output will be placed in: /gpfs/data/hpcadmin/pgap/pgpu372-sing31/mg37_results
PGAP failed, docker exited with rc = 255
Unable to find error in log file.

I appreciate any help in this regard.

azat-badretdin commented 3 years ago

Unable to find error in log file.

can you post the file?

as7a5 commented 3 years ago

Hi Azat, here is the content of cwltool.log:

[root@bigpurple-hn1 mg37_results]# cat cwltool.log
Original command: ./pgap.py -r -o mg37_results test_genomes/MG37/input.yaml

Docker command: /gpfs/share/apps/singularity/3.1/bin/singularity exec --bind /gpfs/home/siavoa01/packages/pgap/pgpu365-sing31/input-2021-01-11.build5132:/pgap/input:ro --bind /gpfs/home/siavoa01/packages/pgap/pgpu365-sing31/test_genomes/MG37:/pgap/user_input --bind /gpfs/home/siavoa01/packages/pgap/pgpu365-sing31/test_genomes/MG37/pgap_input_4by5mm9z.yaml:/pgap/user_input/pgap_input.yaml:ro --bind /tmp:/tmp:rw --bind /gpfs/home/siavoa01/packages/pgap/pgpu365-sing31/mg37_results:/pgap/output:rw --pwd /pgap docker://ncbi/pgap:2021-01-11.build5132 cwltool --timestamps --debug --disable-color --preserve-entire-environment --outdir /pgap/output pgap/pgap.cwl /pgap/user_input/pgap_input.yaml

--- Start YAML Input ---
fasta:
  class: File
  location: ASM2732v1.annotation.nucleotide.1.fasta
submol:
  class: File
  location: pgap_submol_af7zxd5t.yaml
supplemental_data: { class: Directory, location: /pgap/input }
report_usage: true
--- End YAML Input ---

--- Start Runtime Report ---
{
    "CPU cores": 40,
    "Docker image": "ncbi/pgap:2021-01-11.build5132",
    "cpu flags": "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_pt spec_ctrl ibpb_support tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req",
    "cpu model": "Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz",
    "max user processes": 4096,
    "memory (GiB)": 376.6,
    "memory per CPU core (GiB)": 9.4,
    "open files": 65536,
    "tmp disk space (GiB)": 1740.4,
    "virtual memory": "unlimited",
    "work disk space (GiB)": 1749531.2
}
--- End Runtime Report ---

WARNING: Could not set container working directory /pgap: chdir /pgap: no such file or directory
FATAL: container creation failed: mount error: can't remount /pgap/user_input/pgap_input.yaml: no such file or directory
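
pgap.py reported "Unable to find error in log file", yet the WARNING and FATAL messages at the end of the log appear to come from Singularity rather than from cwltool. As a minimal sketch of how to pull such messages out of a cwltool.log, assuming each one starts its own line in the file (the function name `find_runtime_messages` and the prefix list are illustrative assumptions, not part of pgap.py):

```python
# Prefixes seen in the log above, plus ERROR as a guess; this list is an
# assumption and may need extending for other runtimes or versions.
PREFIXES = ("WARNING:", "FATAL:", "ERROR:")


def find_runtime_messages(log_text):
    """Return (level, message) pairs for lines that start with a known prefix."""
    found = []
    for line in log_text.splitlines():
        stripped = line.strip()
        for prefix in PREFIXES:
            if stripped.startswith(prefix):
                level = prefix.rstrip(":")
                message = stripped[len(prefix):].strip()
                found.append((level, message))
                break
    return found


def report(log_path):
    """Print every matching message from the log file at log_path."""
    with open(log_path) as handle:
        for level, message in find_runtime_messages(handle.read()):
            print(f"{level}: {message}")
```

Calling `report("mg37_results/cwltool.log")` would then list only the container-level messages, which is usually enough to tell a container start-up failure from a failure inside the pipeline.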


azat-badretdin commented 3 years ago

It looks like the integrity of the Docker image or container was somehow compromised on your system.

What does docker run -i ncbi/pgap:2021-01-11.build5132 ls -l / tell you?

Here is ours:


$ docker run -i ncbi/pgap:2021-01-11.build5132 ls -l /
total 72
-rw-r--r--   1 root root 12123 Oct  1  2019 anaconda-post.log
lrwxrwxrwx   1 root root     7 Oct  1  2019 bin -> usr/bin
drwxr-xr-x   5 root root   340 Feb 22 14:57 dev
drwxr-xr-x   1 root root  4096 Feb 22 14:57 etc
drwxr-xr-x   1 root root  4096 Jan 12 08:25 home
lrwxrwxrwx   1 root root     7 Oct  1  2019 lib -> usr/lib
lrwxrwxrwx   1 root root     9 Oct  1  2019 lib64 -> usr/lib64
drwxr-xr-x   2 root root  4096 Apr 11  2018 media
drwxr-xr-x   2 root root  4096 Apr 11  2018 mnt
drwxr-xr-x   3 root root  4096 Jan 12 08:23 netmnt
drwxr-xr-x   1 root root  4096 Jan 12 08:23 netopt
drwxr-xr-x   1 root root  4096 Jan 12 08:23 opt
drwxr-xr-x   1 root root  4096 Jan 12 06:34 panfs
drwxr-xr-x   1 root root  4096 Jan 12 09:19 pgap
dr-xr-xr-x 356 root root     0 Feb 22 14:57 proc
dr-xr-x---   1 root root  4096 Jan 12 09:18 root
drwxr-xr-x   1 root root  4096 Jan 12 06:34 run
lrwxrwxrwx   1 root root     8 Oct  1  2019 sbin -> usr/sbin
drwxr-xr-x   2 root root  4096 Apr 11  2018 srv
dr-xr-xr-x  13 root root     0 Feb 18 15:09 sys
drwxrwxrwt   1 root root  4096 Jan 12 09:19 tmp
drwxr-xr-x   1 root root  4096 Oct  1  2019 usr
drwxr-xr-x   1 root root  4096 Oct  1  2019 var

pirale commented 3 years ago

Hi Azat,

I'm on the same team, trying to get PGAP to run on our cluster.

If you ls the container, you can see the pgap directory.

singularity exec -i pgap_2021-01-11.build5132.sif ls -l /
total 40
-rw-r--r--   1 root     root     12123 Sep 30  2019 anaconda-post.log
lrwxrwxrwx   1 root     root         7 Sep 30  2019 bin -> usr/bin
drwxr-xr-x  21 root     root      3900 Feb 10 12:54 dev
lrwxrwxrwx   1 root     root        36 Jan 27 14:31 environment -> .singularity.d/env/90-environment.sh
drwxr-xr-x  58 root     root      2262 Jan 27 14:31 etc
drwxr-xr-x   4 pirona01 pirona01    80 Feb 22 12:52 gpfs
drwxr-xr-x   3 root     root        28 Jan 12 03:25 home
lrwxrwxrwx   1 root     root         7 Sep 30  2019 lib -> usr/lib
lrwxrwxrwx   1 root     root         9 Sep 30  2019 lib64 -> usr/lib64
drwxr-xr-x   2 root     root         3 Apr 11  2018 media
drwxr-xr-x   2 root     root         3 Apr 11  2018 mnt
drwxr-xr-x   3 root     root        33 Jan 12 03:23 netmnt
drwxr-xr-x   3 root     root        35 Jan 12 03:23 netopt
drwxr-xr-x   9 root     root       152 Jan 12 03:23 opt
drwxr-xr-x   3 root     root        74 Jan 12 01:34 panfs
drwxr-xr-x   5 root     root        76 Jan 12 04:19 pgap
dr-xr-xr-x 890 root     root         0 Feb  7 13:27 proc
dr-xr-x---   4 root     root       181 Jan 12 04:18 root
drwxr-xr-x  13 root     root       186 Jan 12 01:34 run
lrwxrwxrwx   1 root     root         8 Sep 30  2019 sbin -> usr/sbin
lrwxrwxrwx   1 root     root        24 Jan 27 14:31 singularity -> .singularity.d/runscript
drwxr-xr-x   2 root     root         3 Apr 11  2018 srv
dr-xr-xr-x  13 root     root         0 Feb  7 13:28 sys
drwxrwxrwt  49 root     root     28672 Feb 22 12:52 tmp
drwxr-xr-x  13 root     root       260 Sep 30  2019 usr
drwxr-xr-x  18 root     root       271 Sep 30  2019 var

If you set the working directory within the same command and ls that same directory:

singularity exec -i --pwd /pgap pgap_2021-01-11.build5132.sif ls -l /pgap
WARNING: Could not set container working directory /pgap: chdir /pgap: no such file or directory
total 0
drwxr-xr-x  2 root root   3 Jan 12 04:19 input
drwxr-xr-x 22 root root 915 Jan 12 04:18 pgap
drwxr-xr-x  7 root root 134 Jan 12 04:18 venv

It's super weird because the directory is present in the container.

Thanks so much for your help!
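
For comparing the Docker and Singularity root listings posted in this thread, here is a rough sketch that extracts the entry names from two `ls -l` outputs and reports what appears only in the second. The helper names `entry_names` and `only_in_second` are made up for illustration, and the parsing assumes the usual nine-column `ls -l` layout:

```python
def entry_names(ls_output):
    """Extract entry names from `ls -l` output, ignoring symlink targets."""
    names = set()
    for line in ls_output.splitlines():
        parts = line.split(None, 8)
        # Skip the "total N" header, shell prompts, and short lines.
        if len(parts) < 9:
            continue
        # A real entry starts with a mode string such as drwxr-xr-x.
        mode = parts[0]
        if len(mode) < 10 or mode[0] not in "-dlcbps":
            continue
        # For symlinks, keep only the link name before " -> target".
        names.add(parts[8].split(" -> ")[0])
    return names


def only_in_second(first_listing, second_listing):
    """Return sorted names present in the second listing but not the first."""
    return sorted(entry_names(second_listing) - entry_names(first_listing))
```

Applied to the two root listings above (Docker first, Singularity second), this should report `environment`, `gpfs` and `singularity`. The `/pgap` directory is present in both, so the listing alone does not explain the chdir warning.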

azat-badretdin commented 3 years ago

Thanks for the input, Alejandro. This is most likely something specific to Singularity. Interestingly, I am able to use the Docker image directly from Singularity, without converting it to a .sif file first:


$ singularity exec -i --pwd /pgap docker://ncbi/pgap:2021-01-11.build5132 ls -l /pgap
INFO:    Using cached SIF image
total 0
drwxr-xr-x  2 root root   3 Jan 12 04:19 input
drwxr-xr-x 22 root root 915 Jan 12 04:18 pgap
drwxr-xr-x  7 root root 134 Jan 12 04:18 venv

azat-badretdin commented 3 years ago

Is it possible that the .sif file was somehow compromised? I have very little knowledge of Singularity.

Would you mind executing the same ls canary command using the original docker:// URI as the image specifier?

pirale commented 3 years ago

Hey Azat,

I can also run:

singularity exec -i --pwd /pgap docker://ncbi/pgap:2021-01-11.build5132 ls -l /pgap
WARNING: Could not set container working directory /pgap: chdir /pgap: no such file or directory
total 0
drwxr-xr-x  2 root root   3 Jan 12 04:19 input
drwxr-xr-x 22 root root 915 Jan 12 04:18 pgap
drwxr-xr-x  7 root root 134 Jan 12 04:18 venv

but I get the same result, as you can see. PGAP runs under Singularity without problems on the UGE HPC cluster of another institution I have access to, but unfortunately not where I need it.

I have started afresh several times; if the container is getting corrupted, then it is happening each time I download it.

I am not acquainted with canary; could you please post a link?

azat-badretdin commented 3 years ago

Hi, Alejandro!

I am not acquainted with canary; could you please post a link?

"Canary" here is a synonym for "test" (as in "canary in a coal mine").

Your result is specific to a particular Singularity configuration on a particular system. I am afraid we can't help much here; maybe your systems team could help?
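
When handing this over to the local administrators, the canary test could be wrapped roughly as follows. This is only a sketch: the function names are invented for illustration, the command mirrors the one used earlier in this thread, and it assumes Singularity writes its warning to stderr.

```python
import shutil
import subprocess

# Text of the warning seen in this thread.
CHDIR_WARNING = "Could not set container working directory"


def canary_command(image, workdir="/pgap"):
    """Build the same `ls` canary command that was run in this thread."""
    return ["singularity", "exec", "-i", "--pwd", workdir, image,
            "ls", "-l", workdir]


def chdir_failed(stderr_text):
    """Return True if the output contains the working-directory warning."""
    return CHDIR_WARNING in stderr_text


def run_canary(image):
    """Run the canary and summarise the result; requires singularity on PATH."""
    if shutil.which("singularity") is None:
        raise RuntimeError("singularity not found on PATH")
    result = subprocess.run(canary_command(image),
                            capture_output=True, text=True)
    return {
        "returncode": result.returncode,
        "chdir_failed": chdir_failed(result.stderr),
        "stdout": result.stdout,
    }
```

Running `run_canary("docker://ncbi/pgap:2021-01-11.build5132")` on both the working and the failing cluster would give two small reports that are easy to compare side by side.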

pirale commented 3 years ago

Thank you for your help, Azat!

Alejandro

azat-badretdin commented 3 years ago

You are welcome, Alejandro!