ncbi / pgap

NCBI Prokaryotic Genome Annotation Pipeline

fatal error : shell /bin/sh doesn't exist in container #235

Closed vdejager closed 1 year ago

vdejager commented 1 year ago

Describe the bug

I'm getting the following fatal error when trying to run pgap:

shell /bin/sh doesn't exist in container
Traceback (most recent call last):
  File "/tools/software/bioinfo-tools/pipelines/pgap/pgap.py", line 965, in main
    retcode = p.launch()
  File "/tools/software/bioinfo-tools/pipelines/pgap/pgap.py", line 477, in launch
    self.record_runtime(f)
  File "/tools/software/bioinfo-tools/pipelines/pgap/pgap.py", line 406, in record_runtime
    result = subprocess.run(cmd, stdin=subprocess.DEVNULL, check=True, stdout=subprocess.PIPE)
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/opt/ohpc/pub/libs/singularity/3.4.1/bin/singularity', 'exec', '--bind', '/tmp:/cwd:ro', '/tools/software/bioinfo-tools/pipelines/pgap', 'bash', '-c', 'df -k /cwd /tmp ; ulimit -a ; cat /proc/{meminfo,cpuinfo}']' returned non-zero exit status 255.

To Reproduce

pgap.py --no-internet --no-self-update --container-path $PGAP_INPUT_DIR -v -n -o /home/vdejager/pgap_test/results /home/vdejager/pgap_test/input.yaml

Expected behavior

A running annotation. The same input YAML annotates successfully with pgap on another system that uses the same Singularity image with Apptainer version 1.1.4-2.el8.

Software versions (please complete the following information):

Log Files

No debug log was created.


azat-badretdin commented 1 year ago

Thank you for your report, Vic!

This part is strange:

 '['/opt/ohpc/pub/libs/singularity/3.4.1/bin/singularity', 'exec', '--bind', '/tmp:/cwd:ro', '/tools/software/bioinfo-tools/pipelines/pgap', 'bash', '-c', 'df -k /cwd /tmp ; ulimit -a ; cat /proc/{meminfo,cpuinfo}']' 

The parameter between bind specs and bash should be either a container spec or container path. But you are supplying a PGAP directory.

The --help says:

  --container-path CONTAINER_PATH
                        Override path to image.

Could you please fix your command line (--container-path part) and try again?
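To make the point about argument order concrete, here is a minimal sketch that rebuilds the argv list from the traceback. The helper name `build_exec_cmd` is hypothetical and is not taken from pgap.py; only the order of the arguments is copied from the reported command.

```python
from typing import List


def build_exec_cmd(singularity_bin: str, bind: str, image: str, script: str) -> List[str]:
    """Assemble an argv list in the same order as the command in the traceback.

    Hypothetical helper for illustration; pgap.py may build its command differently.
    """
    return [singularity_bin, "exec", "--bind", bind, image, "bash", "-c", script]


cmd = build_exec_cmd(
    "/opt/ohpc/pub/libs/singularity/3.4.1/bin/singularity",
    "/tmp:/cwd:ro",
    "/tools/software/bioinfo-tools/pipelines/pgap",  # a directory, as in the report
    "df -k /cwd /tmp ; ulimit -a ; cat /proc/{meminfo,cpuinfo}",
)

# The image argument is the one that directly follows the bind spec.
image_arg = cmd[cmd.index("--bind") + 2]
print("image argument:", image_arg)
```

In the reported command this slot holds the PGAP installation directory rather than an image file, which is what the comment above points out.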

vdejager commented 1 year ago

The installation location of pgap in my case is /tools/software/bioinfo-tools/pipelines/pgap. My environment contains:

export PGAP_INPUT_DIR=/tools/software/bioinfo-tools/pipelines/pgap
export PATH=$PATH:$PGAP_INPUT_DIR

I'm now using the following command line, starting in /work/vdejager/test:

pgap.py -v -c 8 -d -n --no-self-update --auto-correct-tax --container-path /tools/software/bioinfo-tools/pipelines/pgap -o test_10012023 /work/vdejager/test/input.yaml

If I run the command from /tools/software/bioinfo-tools/pipelines/pgap (where the pipeline is installed) with the --container-path option, I get the same error message:

/tools/software/bioinfo-tools/pipelines/pgap$ ./pgap.py -v -c 8 -d -n --no-self-update --auto-correct-tax --container-path /tools/software/bioinfo-tools/pipelines/pgap -o test_10012023 /work/vdejager/test/input.yaml
PGAP version 2022-12-13.build6494 is up to date.
Not trying to update self, because the --no-self-update flag is enabled.
Output will be placed in: /tools/software/bioinfo-tools/pipelines/pgap/test_10012023.2
WARNING: passwd file doesn't exist in container, not updating
WARNING: group file doesn't exist in container, not updating
FATAL:   shell /bin/sh doesn't exist in container
Traceback (most recent call last):
  File "./pgap.py", line 986, in <module>
    main()
  File "./pgap.py", line 965, in main
    retcode = p.launch()
  File "./pgap.py", line 477, in launch
    self.record_runtime(f)
  File "./pgap.py", line 406, in record_runtime
    result = subprocess.run(cmd, stdin=subprocess.DEVNULL, check=True, stdout=subprocess.PIPE)
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/opt/ohpc/pub/libs/singularity/3.4.1/bin/singularity', 'exec', '--bind', '/tools/software/bioinfo-tools/pipelines/pgap:/cwd:ro', '/tools/software/bioinfo-tools/pipelines/pgap', 'bash', '-c', 'df -k /cwd /tmp ; ulimit -a ; cat /proc/{meminfo,cpuinfo}']' returned non-zero exit status 255.

I get a successful run if I omit the --container-path option and start the pipeline from within the installation directory (/tools/software/bioinfo-tools/pipelines/pgap), which also contains the SIF file.

./pgap.py -v -c 8 -d -n --no-self-update --auto-correct-tax -o /work/vdejager/test/test_10012023 /work/vdejager/test/input.yaml

azat-badretdin commented 1 year ago

Vic,

--container-path /tools/software/bioinfo-tools/pipelines/pga

is not the path to the image; it is the path to a directory that (presumably) contains the image. Please specify the path to the file, not to the directory.
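The file-versus-directory distinction above can be checked before launching anything. The following is a minimal sketch, not part of pgap.py: `check_container_path` is a hypothetical helper, and the `.sif` extension used for the hint is an assumption based on the SIF file mentioned in this thread.

```python
from pathlib import Path
from typing import List


def check_container_path(path_str: str) -> Path:
    """Return the path if it is a regular file; raise a descriptive error otherwise.

    Hypothetical helper for illustration; pgap.py does not necessarily do this.
    """
    path = Path(path_str)
    if path.is_dir():
        # Assumption: image files carry a .sif extension.
        candidates: List[str] = sorted(p.name for p in path.glob("*.sif"))
        hint = f" Candidate image files found there: {candidates}" if candidates else ""
        raise ValueError(f"{path} is a directory, not an image file.{hint}")
    if not path.is_file():
        raise FileNotFoundError(f"No image file at {path}")
    return path


try:
    check_container_path("/tools/software/bioinfo-tools/pipelines/pgap")
except (ValueError, FileNotFoundError) as err:
    print(err)
```

A check like this would report the problem in plain words instead of the less obvious "shell /bin/sh doesn't exist in container".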

vdejager commented 1 year ago

Thanks. I was thrown off by the fact that the image was found, and by the help text for the --container-name option: "Specify a container name that will be used instead of automatically generated." I assumed the container name was reconstructed from the pgap.py version.

azat-badretdin commented 1 year ago

I see. --container-name is basically a decorative option. Under Docker, you can see the name in the list of currently running containers, which is useful on a large system with plenty of Docker users. How it works for Singularity, I do not know off the top of my head. Maybe it is not used there.

TL;DR: --container-name is a purely decorative option.

vdejager commented 1 year ago

OK, it works now. Thanks.

azat-badretdin commented 1 year ago

Glad to hear that, Vic. Just some more clarification on containerology here:

Fact number one: the current definitions of "container" and "image" are a bit counterintuitive. We are used to "image" meaning the in-memory image of an application that is stored somewhere on disk. In Docker, the image is the "template" (the one that PGAPx downloads and installs on your local computer). The "container", which sounds more "permanent", is actually an instantiation of the image in memory, created by the "docker run" command that we spawn from ./pgap.py.

So when we say "container name", we mean the decorative name of the application running in memory, which performs the user's task or tasks using the image as a template.
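As a loose analogy only (this is not Docker's or Singularity's API), the relationship can be sketched in Python: the image is a fixed template on disk, and each run creates a separately named container from it. All names here, including the `pgap.sif` file name, are hypothetical.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Image:
    """The on-disk template; it does not change when it is run."""
    path: str


@dataclass
class Container:
    """A running instantiation of an image, carrying a purely decorative name."""
    image: Image
    name: str


_counter = itertools.count(1)


def run(image: Image, name: Optional[str] = None) -> Container:
    """Start a container from an image; generate a name if none is given."""
    if name is None:
        name = f"auto_{next(_counter)}"
    return Container(image=image, name=name)


image = Image("pgap.sif")  # hypothetical file name
first = run(image)
second = run(image, name="my_annotation_job")
print(first.name, second.name, first.image is second.image)
```

Both containers share one image and differ only in name, which is why the name plays no role in locating the image.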

Hope this helps.