singularityhub / singularity-hpc

Local filesystem registry for containers (intended for HPC) using Lmod or Environment Modules. Works for users and admins.
https://singularity-hpc.readthedocs.io
Mozilla Public License 2.0
111 stars, 26 forks

GitLab local registry support #471

Closed nrcfieldsa closed 2 years ago

nrcfieldsa commented 2 years ago

Is it currently supported to use a GitLab repository or container registry on a private network with shpc? Is it possible with a config to update the base repository url?

In an HPC project at our organization, the goal is to push and pull Singularity containers to a GitLab server, using shpc and environment modules to give users simplified access to software on the cluster. However, some of the containers are to be stored on the private registry, while others can be used directly from public registries.

I am interested in other users' opinions on whether this is a good feature to request or contribute. This project seems to meet a number of sought-after features, and the efforts are appreciated.

So far in testing, I have tried editing the shpc Python files in the install path to change two values that appear to be statically defined (I cannot find a config option for them: https://singularity-hpc.readthedocs.io/en/latest/getting_started/user-guide.html#id3 ):

$ grep -nRi github_url /usr/local/lib/python3.6/site-packages/singularity_hpc-0.0.24-py3.6.egg/shpc
/usr/local/lib/python3.6/site-packages/singularity_hpc-0.0.24-py3.6.egg/shpc/defaults.py:15:# The GitHub repository with recipes
/usr/local/lib/python3.6/site-packages/singularity_hpc-0.0.24-py3.6.egg/shpc/defaults.py:16:github_url = "https://github.com/singularityhub/singularity-hpc"
..
$ grep -A5 -B2 -n 'url =' /usr/local/lib/python3.6/site-packages/singularity_hpc-0.0.24-py3.6.egg/shpc/main/container.py
78-
79-        # Assemble the artifact url
80:        #url = "https://github.com/%s/releases/download/%s/%s.%s.sif" % (
81:        url =  "https://gitlabxyz.nrc-cnrc.gc.ca/singularity-hpc/%s/%s.%s.sif" % (
82-            repo,
83-            github_tag,
84-            prefix,
85-            container_tag,
86-        )

The existing config options for base directory paths (module_base, container_base and namespace) are local to the host running shpc, so setting them doesn't seem to help with our repository access.

vsoch commented 2 years ago

Is it currently supported to use a GitLab repository or container registry on a private network with shpc? Is it possible with a config to update the base repository url?

If you have permission to pull the container outside of shpc (e.g., with singularity natively), it should be possible - so the first thing I would do is produce a singularity command that successfully pulls, show that to me, and then we can figure out the best approach to take.

nrcfieldsa commented 2 years ago

Just as a test we have a CentOS 7 container with miniconda installed and I can pull it as follows:

$ singularity pull docker://registry.azurecr.io/miniconda:latest

A bit more information on what seems to work to some extent: there was no '@' support in podman on the RHEL 8 compatible AlmaLinux test VM where I'm running the singularity, docker and shpc commands. It seems that docker-pull(1) commands are aliased to podman-pull(1) commands, but lack certain syntax. Thus, I've had to use the following settings to get shpc install to recognize my private Docker registry.

$ shpc config set gh:gitlabxyz.nrc-cnrc.gc.ca/singularity-hpc/     
$ shpc config set docker:nrc-private-regsitry.azurecr.io

$ diff -u /usr/local/lib/python3.6/site-packages/singularity_hpc-0.0.24-py3.6.egg/shpc{,.bak}/main/modules.py
--- /usr/local/lib/python3.6/site-packages/singularity_hpc-0.0.24-py3.6.egg/shpc/main/modules.py        2022-01-19 19:34:45.675694606 -0500
+++ /usr/local/lib/python3.6/site-packages/singularity_hpc-0.0.24-py3.6.egg/shpc.bak/main/modules.py    2022-01-11 20:45:46.679342526 -0500
@@ -374,7 +374,7 @@

         # We pull by the digest
         if pull_type == "docker":
-            container_uri = "docker://%s:%s" % (config.docker, tag.digest)
+            container_uri = "docker://%s@%s" % (config.docker, tag.digest)
         elif pull_type == "gh":
             container_uri = "gh://%s/%s:%s" % (config.gh, tag.digest, tag.name)

I also had to convert the container from Singularity to Docker format in order to populate the registry with a Docker-format image. Otherwise, it would complain about the wrong signature/checksum during singularity pull docker://registry, even though a SIF image had been uploaded to the same container name/tag. It just so happens that the registry supports both docker:// and oras:// (OCI) protocols, but you need to store the images separately after conversion.

The container.yaml file used was similar to:

docker: registry.azurecr.io/miniconda
#url: docker://registry.azurecr.io/miniconda:latest
maintainer: '@fieldsa'
description: miniconda
latest:
  0.0.1: "latest"
tags:
  0.0.1: "latest"
filter:
  - "0.0.*"
aliases:
  miniconda: /singularity

It doesn't seem the url: is used and is just a comment. Perhaps in a future release it could take this parameter from the file and populate the docker:// host string from it, instead of using the docker: line. Perhaps it could ignore the config for docker and gh completely and just use the string provided in container.yaml instead.

Generally this is encouraging as it's going in the right direction. I can load the module and run miniconda from the Singularity command line in the container.

vsoch commented 2 years ago

So - you shouldn't need to do any more than create a subfolder under registry with the namespace you want, e.g.:

registry/
   registry.azurecr.io/
       miniconda/

and then write a container.yaml in that folder - I think you were close but you need the actual digests for the tags.

docker: registry.azurecr.io/miniconda
url: "https:/...." # some web page for the container
maintainer: '@nrcfieldsa'
description: The container description.
latest:
  latest: sha256:xxxxxxxxxxx
tags:
  latest: sha256:xxxxxxxxxxx
aliases:
  miniconda: /singularity

And make sure to get the correct hashes, and inspect the container inside for the aliases you want to expose. This is a docker pull so I'm not sure why you are trying to edit GitHub / GitLab?
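
As a rough check for the point about digests, here is a minimal Python sketch. It is not part of shpc: the function name `find_non_digest_tags` is hypothetical, and it assumes container.yaml has already been loaded into a plain dict with the `latest` and `tags` sections shown above. It only flags values that are not full sha256 digests; it does not verify that a digest exists in the registry.

```python
import re

# A full sha256 digest as written in container.yaml:
# "sha256:" followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"^sha256:[0-9a-f]{64}$")


def find_non_digest_tags(recipe):
    """Return the sorted names of tags whose value is not a sha256 digest.

    `recipe` is assumed to be container.yaml already parsed into a dict.
    Both the `latest` and `tags` sections are inspected.
    """
    bad = set()
    for section in ("latest", "tags"):
        for tag, value in (recipe.get(section) or {}).items():
            if not DIGEST_RE.match(str(value)):
                bad.add(str(tag))
    return sorted(bad)


if __name__ == "__main__":
    # The recipe from earlier in the thread, which used "latest" as the value.
    recipe = {
        "docker": "registry.azurecr.io/miniconda",
        "latest": {"0.0.1": "latest"},
        "tags": {"0.0.1": "latest"},
    }
    print(find_non_digest_tags(recipe))
```

An empty result means every listed tag points at something shaped like a digest.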

These two settings don't mean anything (there is no docker or gh setting in settings)

$ shpc config set gh:gitlabxyz.nrc-cnrc.gc.ca/singularity-hpc/     
$ shpc config set docker:nrc-private-regsitry.azurecr.io

We could definitely add an oras:// uri for images, but that sounds like a different issue than what we have here.

vsoch commented 2 years ago

It doesn't seem the url: is used and is just a comment

The URL is a user friendly URL for the image. E.g., for a docker URI it would be the page on Docker Hub. It's intended only to be parsed into the docs for a human and is not used to derive the image name. The image name belongs entirely under docker:// given a docker URI.

vsoch commented 2 years ago

Otherwise, it would complain about the wrong signature/checksum during singularity pull docker://registry even though there was an SIF image upload to the same container name/tag. It just so happens that the registry supports both docker:// and oras:// (OCI) protocols, but you need to store the images separately after conversion.

We could easily add an oras:// endpoint / puller if native SIF is your preference.

nrcfieldsa commented 2 years ago

Thanks for clarifying the registry formats and configs.

The reason I was attempting to also set a github_url, was so that if the container.yaml file is pointing to a gh: project - that project could contain the SIF images (git commit && git push), versus using the docker private registry (docker push) or even singularity push where supported by OCI.

We can do either, and as a matter of flexibility I'd like to test both. However, using one alone is enough for most requirements, as we can instruct users which registry to upload to, so that it becomes a simple: module load # on hpc clusters. This will prove handy for in-house code that isn't yet released for public use on GitHub, for example, where a container image could be produced and put into standard registries like DockerHub (I think one has been already).

I agree that adding an oras:// endpoint would be a great contribution. Native SIF is our preference, since we could then leverage images produced by researchers on machines without root, using singularity build --fakeroot followed by singularity push oras://registry

Just to mention: individuals can use the main GitHub repository or set up their own GitLab hooks. Doing an end-to-end build on the GitHub project with singularity-deploy is a great idea, as then all that is needed is a commit with the latest Dockerfile or definition file. If I read it correctly, the SIF file is produced during the merge request and placed into /releases on the GitHub project, automatically stored with version and release.

vsoch commented 2 years ago

The reason I was attempting to also set a github_url, was so that if the container.yaml file is pointing to a gh: project - that project could contain the SIF images (git commit && git push), versus using the docker private registry (docker push) or even singularity push where supported by OCI.

If you want to use SIF we should add an oras endpoint proper. The original gh URI was created before packages supported SIF, and it would allow pulling a SIF from within an official release using a tool called Singularity Deploy (as you noted above!). What you are trying to do - SIF via oras from GitHub, is not currently supported. But I think we should 100% have it, because certainly I do a lot of deployment there too! The only drawback with using oras / SIF over docker (and pulling to Singularity) is that you cannot pull a SIF with Docker. I think Podman might eventually have support though.

If this sounds good to you I'll add to my TODO to work on - this isn't a work project so likely I'll find some time an evening this week, and latest over the weekend. Let me know if that would be ok!

nrcfieldsa commented 2 years ago

Certainly. I appreciate your efforts and look forward to contributing back what I can, if only in terms of testing / updating ticket with results or further steps.

vsoch commented 2 years ago

Will do ! I'll get started asap - I don't think it should be too much work, just need to make sure to add the proper testing and whatnot. Will ping you when I have something to try!

vsoch commented 2 years ago

okay @nrcfieldsa I have a PR ready for you!

https://github.com/singularityhub/singularity-hpc/pull/472

Note that you'll need to install shpc from that branch, and also do pip install -e . again to get a newly released singularity python (spython) as the previous didn't have support for oras. I've also added an example ghcr.io image: https://github.com/singularityhub/github-ci/pkgs/container/github-ci and you should be able to try shpc install ghcr.io/singularityhub/github-ci.

Try it out, tweak the recipe for your use case, and let me know what further steps we need to work on. I'll be back tomorrow evening for another round.

nrcfieldsa commented 2 years ago

Success: test install from branch add/oras using shpc-0.0.39 (PR#472).

## 1. Use shpc-0.0.39 on base OS (AlmaLinux 8.5/Python 3.6.8/Singularity 3.8.5-2.el8/Podman 3.3.1):
######
[nrcfieldsa@container-test singularity-hpc]$ pip3 install -e .   # ok
[nrcfieldsa@container-test singularity-hpc]$ less /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/registry/ghcr.io/singularityhub/github-ci/container.yaml
oras: ghcr.io/singularityhub/github-ci
url: https://github.com/singularityhub/github-ci/pkgs/container/github-ci
maintainer: '@vsoch'
description: An example SIF on GitHub packages to pull with oras
latest:
  latest: sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7
tags:
  latest: sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7

[nrcfieldsa@container-test singularity-hpc]$ shpc install ghcr.io/singularityhub/github-ci
singularity pull --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/ghcr.io/singularityhub/github-ci/latest/ghcr.io-singularityhub-github-ci-latest-sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7.sif oras://ghcr.io/singularityhub/github-ci@sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7
INFO:    Using cached SIF image
INFO:    Using cached SIF image
/home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/ghcr.io/singularityhub/github-ci/latest/ghcr.io-singularityhub-github-ci-latest-sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7.sif
Module ghcr.io/singularityhub/github-ci:latest was created.

[nrcfieldsa@container-test singularity-hpc]$ export MODULEPATH=$(pwd)/modules
[nrcfieldsa@container-test singularity-hpc]$ module avail

----------------------------- /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules -----------------------------
   ghcr.io/singularityhub/github-ci/latest/module

[nrcfieldsa@container-test singularity-hpc]$ module load ghcr.io/singularityhub/github-ci
[nrcfieldsa@container-test singularity-hpc]$ github-ci-run
Hold me closer... tiny container :) :D

[nrcfieldsa@container-test singularity-hpc]$ github-ci-shell
Singularity>
Singularity> ls -l /singularity
lrwxrwxrwx    1 root     root            24 Oct 23 18:59 /singularity -> .singularity.d/runscript
Singularity> cat /singularity
#!/bin/sh

echo "Hold me closer... tiny container :) :D"

Singularity> exit

## 2. Use the included ./Dockerfile to test running shpc-0.0.39 inside a singularity container
#####

# build a container with lmod for singularity-hpc
[nrcfieldsa@container-test singularity-hpc]$ docker build -t shpc-oras-docker.tar .
[nrcfieldsa@container-test singularity-hpc]$ docker save -o shpc-oras.sif localhost/shpc-oras-docker.tar .

[nrcfieldsa@container-test singularity-hpc]$ singularity build -s shpc-oras shpc-oras.sif
INFO:    Starting build...
INFO:    Verifying bootstrap image shpc-oras.sif
WARNING: integrity: signature not found for object group 1
WARNING: Bootstrap image could not be verified, but build will continue.
INFO:    Creating sandbox directory...
INFO:    Build complete: shpc-oras

[nrcfieldsa@container-test singularity-hpc]$ singularity shell -w shpc-oras/
Singularity> shpc --version
0.0.39
Singularity> ls
CHANGELOG.md  Dockerfile      LICENSE      README.md  entrypoint.sh  registry  shpc       shpc-oras-docker.tar
CITATION.cff  Dockerfile.tcl  MANIFEST.in  docs       paper          setup.py  shpc-oras  shpc-oras.sif
Singularity> touch /code/test # writable? ok
Singularity> rm /code/test

#####################################################################################################
Singularity> shpc install ghcr.io/singularityhub/github-ci
singularity pull --name /code/modules/ghcr.io/singularityhub/github-ci/latest/ghcr.io-singularityhub-github-ci-latest-sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7.sif oras://ghcr.io/singularityhub/github-ci@sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7
INFO:    Downloading oras image
INFO:    Downloading oras image
/code/modules/ghcr.io/singularityhub/github-ci/latest/ghcr.io-singularityhub-github-ci-latest-sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7.sif
Module ghcr.io/singularityhub/github-ci:latest was created.
Singularity> shpc --version
0.0.39
Singularity> shpc show ghcr.io/singularityhub/github-ci
oras: ghcr.io/singularityhub/github-ci
url: https://github.com/singularityhub/github-ci/pkgs/container/github-ci
maintainer: '@vsoch'
description: An example SIF on GitHub packages to pull with oras
latest:
  latest: sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7
tags:
  latest: sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7
######################################################################################################

Singularity> module load ghcr.io/singularityhub/github-ci
bash: module: command not found
Singularity> . /etc/profile.d/modules.sh
Singularity> export MODULEPATH=/code/modules:$MODULEPATH
Singularity> module avail

-------------------------------------------------------- /code/modules ---------------------------------------------------------
   ghcr.io/singularityhub/github-ci/latest/module

Singularity> module load ghcr.io/singularityhub/github-ci
[nrcfieldsa@container-test singularity-hpc]$ set|less
..
github-ci-inspect-deffile ()
{
    singularity ${SINGULARITY_OPTS} inspect ${SINGULARITY_COMMAND_OPTS} -d /code/modules/ghcr.io/singularityhub/github-ci/latest/ghcr.io-singularityhub-github-ci-latest-sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7.sif
}
github-ci-inspect-runscript ()
{
    singularity ${SINGULARITY_OPTS} inspect ${SINGULARITY_COMMAND_OPTS} -r /code/modules/ghcr.io/singularityhub/github-ci/latest/ghcr.io-singularityhub-github-ci-latest-sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7.sif
}
github-ci-run ()
{
    singularity ${SINGULARITY_OPTS} run ${SINGULARITY_COMMAND_OPTS} -B /code/modules/ghcr.io/singularityhub/github-ci/latest/99-shpc.sh:/.singularity.d/env/99-shpc.sh /code/modules/ghcr.io/singularityhub/github-ci/latest/ghcr.io-singularityhub-github-ci-latest-sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7.sif $@
}
github-ci-shell ()
{
    singularity ${SINGULARITY_OPTS} shell ${SINGULARITY_COMMAND_OPTS} -s /bin/sh -B /code/modules/ghcr.io/singularityhub/github-ci/latest/99-shpc.sh:/.singularity.d/env/99-shpc.sh /code/modules/ghcr.io/singularityhub/github-ci/latest/ghcr.io-singularityhub-github-ci-latest-sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7.sif
}

For next test: I will run shpc config again for my local registry and try with a private container registry using OCI.

vsoch commented 2 years ago

Sounds good!

For next test: I will run shpc config again for my local registry and try with a private container registry using OCI.

You shouldn't need to use shpc config for anything - you just need to create the corresponding container.yaml and folder that has the registry URI there.
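
To illustrate the "folder plus container.yaml" layout described here, a minimal Python sketch follows. The helper name `write_recipe` and its parameters are hypothetical and not part of shpc. It writes only the fields needed to show the layout (the scheme line, `latest`, `tags` and an optional alias) and leaves out `url`, `maintainer` and `description`, which the recipes in this thread also carry.

```python
import tempfile
from pathlib import Path


def write_recipe(registry_root, uri, digest, alias=None, scheme="oras"):
    """Create <registry_root>/<uri>/container.yaml and return its path.

    The folder path mirrors the registry URI, for example
    registry/ghcr.io/singularityhub/github-ci/container.yaml.
    `alias` is an optional (name, command) pair.
    """
    folder = Path(registry_root) / uri
    folder.mkdir(parents=True, exist_ok=True)
    lines = [
        f"{scheme}: {uri}",
        "latest:",
        f"  latest: {digest}",
        "tags:",
        f"  latest: {digest}",
    ]
    if alias:
        name, command = alias
        lines += ["aliases:", f"  {name}: {command}"]
    path = folder / "container.yaml"
    path.write_text("\n".join(lines) + "\n")
    return path


if __name__ == "__main__":
    # Write into a throwaway directory so nothing in a real checkout is touched.
    with tempfile.TemporaryDirectory() as tmp:
        recipe = write_recipe(
            Path(tmp) / "registry",
            "ghcr.io/singularityhub/github-ci",
            "sha256:227a917e9ce3a6e1a3727522361865ca92f3147fd202fa1b2e6a7a8220d510b7",
        )
        print(recipe.read_text())
```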

nrcfieldsa commented 2 years ago

The issue I get in my test with private OCI container registry is as follows:

[nrcfieldsa@container-test singularity-hpc]$ cat registry/lolnrc/container.yaml
oras: registry.azurecr.io/lolnrc
url: oras://registry.azurecr.io/lolnrc:
maintainer: '@nrcfieldsa'
description: Cowsay NRC
latest:
  latest: "sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9"
tags:
  latest: "sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9"
aliases:
  lolnrc: /singularity

[nrcfieldsa@container-test singularity-hpc]$ shpc install lolnrc
singularity pull --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/registry.azurecr.io/lolnrc/latest/registry.azurecr.io-lolnrc-latest-sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9.sif oras://registry.azurecr.io/lolnrc@sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://registry.azurecr.io/lolnrc@sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9: while resolving reference: pulling from host registry.azurecr.io failed with status code [manifests sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9]: 500 Internal Server Error
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://registry.azurecr.io/lolnrc@sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9: while resolving reference: pulling from host registry.azurecr.io failed with status code [manifests sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9]: 500 Internal Server Error
There was an issue pulling None

[nrcfieldsa@container-test singularity-hpc]$ cat registry/lolnrc/container.yaml
oras: registry.azurecr.io/lolnrc
url: oras://registry.azurecr.io/lolnrc:latest
maintainer: '@nrcfieldsa'
description: Cowsay NRC
latest:
  0.0.1: latest
tags:
  0.0.1: latest
aliases:
  lolnrc: /singularity

[nrcfieldsa@container-test singularity-hpc]$ shpc install lolnrc
singularity pull --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/registry.azurecr.io/lolnrc/0.0.1/registry.azurecr.io-lolnrc-0.0.1-latest.sif oras://registry.azurecr.io/lolnrc@latest
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://registry.azurecr.io/lolnrc@latest: while resolving reference: invalid checksum digest format
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://registry.azurecr.io/lolnrc@latest: while resolving reference: invalid checksum digest format
There was an issue pulling None

# when I pushed the container as a test:
$ sha256sum lolnrc.sif
d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9 lolnrc.sif
$ singularity push lolnrc.sif oras://registry.azurecr.io/lolnrc:latest
INFO:    Upload complete
$ singularity push lolnrc.sif oras://registry.azurecr.io/lolnrc@sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9
INFO:    Upload complete

I don't think my version of singularity likes @latest; it needs :latest instead to download without running into a checksum error. It also doesn't seem to pull OK with the sha256 scheme when specifying @sha256:..., even if singularity is called directly and attempts to parse the checksum value.


# pull with "latest":
[nrcfieldsa@container-test registry]$ singularity pull --disable-cache --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/registry.azurecr.io/lolnrc/0.0.1/registry.azurecr.io-lolnrc-0.0.1-latest.sif oras://registry.azurecr.io/lolnrc@latest
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://registry.azurecr.io/lolnrc@latest: while resolving reference: invalid checksum digest format
[nrcfieldsa@container-test registry]$ singularity pull --disable-cache --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/registry.azurecr.io/lolnrc/0.0.1/registry.azurecr.io-lolnrc-0.0.1-latest.sif oras://registry.azurecr.io/lolnrc:latest
INFO:    Downloading oras image

# pull with sha256:
[nrcfieldsa@container-test singularity-hpc]$ singularity pull --disable-cache --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/registry.azurecr.io/lolnrc/0.0.1/registry.azurecr.io-lolnrc-0.0.1-latest.sif oras://registry.azurecr.io/lolnrc:sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://registry.azurecr.io/lolnrc:sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9: while resolving reference: pulling from host registry.azurecr.io failed with status code [manifests sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9]: 500 Internal Server Error
[nrcfieldsa@container-test singularity-hpc]$ singularity pull --disable-cache --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/registry.azurecr.io/lolnrc/0.0.1/registry.azurecr.io-lolnrc-0.0.1-latest.sif oras://registry.azurecr.io/lolnrc@sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://registry.azurecr.io/lolnrc@sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9: while resolving reference: pulling from host registry.azurecr.io failed with status code [manifests sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9]: 500 Internal Server Error

Arguably GitHub may make a more friendly backend - however, is this something to do with OCI artifacts standard when specifying '@', and not specific to the registry in question or singularity CLI?

nrcfieldsa commented 2 years ago

You shouldn't need to use shpc config for anything - you just need to create the corresponding container.yaml and folder that has the registry URI there.

Note my test was done without changing any config values, just by adding the container.yaml into the checked-out project's ./registry dir.

vsoch commented 2 years ago

Can you show me the pull command that works?

I've actually run into that issue too with tags - if you don't provide a tag it messes things up a bit.

vsoch commented 2 years ago

For pull we usually default to adding the shasum (akin to docker) but if that doesn't work here we can skip it (and likely we cannot use it for pull).

nrcfieldsa commented 2 years ago

Can you show me the pull command that works?

I've actually run into that issue too with tags - if you don't provide a tag it messes things up a bit.

Yes, I've discovered why. The slow motion replay is..

working pull command:

$ singularity pull oras://registry.azurecr.io/lolnrc:latest
INFO:    Downloading oras image
$ ls -l lolnrc_latest.sif
-rwxr-xr-x. 1 nrcfieldsa nrcfieldsa 98254848 Jan 20 19:23 lolnrc_latest.sif

Any combination using @sha256:... with the image's own checksum (the SIF checksum) didn't work. The ORAS project [https://github.com/oras-project/oras/blob/4e0d1e2bd8f00254977a5a0d0cb87a583ea5eecd/cmd/oras/pull.go#L42] says you can have either a :tag or an @digest:

42      Use:   "pull <name:tag|name@digest>",
43      Short: "Pull files from remote registry",
44      Long: `Pull files from remote registry

However, on the registry side I see a different checksum for the entire image since it adds a layer with JSON meta-data after I push the SIF image.

lolnrc:latest
sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17

Repository: lolnrc
Tag: latest
Tag creation date: 1/20/2022, 6:17 PM EST
Tag last updated date: 1/20/2022, 6:17 PM EST
Digest: sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17
Manifest creation date: 1/20/2022, 6:17 PM EST
Platform: -

Artifact reference: registry.azurecr.io/lolnrc:latest
Manifest: {
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.unknown.config.v1+json",
    "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
    "size": 2
  },
  "layers": [
    {
      "mediaType": "application/vnd.sylabs.sif.layer.v1.sif",
      "digest": "sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9",
      "size": 98254848,
      "annotations": {
        "org.opencontainers.image.title": "lolnrc.sif"
      }
    }
  ]
}

The digest matches on the layer with the SIF file, so when I test with the checksum from all layers I get:

[nrcfieldsa@container-test registry]$ rm lolnrc_latest.sif; singularity pull --disable-cache oras://registry.azurecr.io/lolnrc@sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17
INFO:    Downloading oras image
[nrcfieldsa@container-test registry]$ ls -l lolnrc*.sif
-rwxr-xr-x.  1 nrcfieldsa nrcfieldsa 98254848 Jan 20 19:46 lolnrc@sha256_91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17.sif
[nrcfieldsa@container-test registry]$ module load registry.azurecr.io/lolnrc
[nrcfieldsa@container-test registry]$ lolnrc
 __________
< NRC-CNRC >
 ----------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

vsoch commented 2 years ago

oh interesting! So when you add the correct digest to your container.yaml, does the install then work?

nrcfieldsa commented 2 years ago

For pull we usually default to adding the shasum (akin to docker) but if that doesn't work here we can skip it (and likely we cannot use it for pull).

When the correct digest is provided this is working fine.

The problem is not with the singularity-hpc code, nor with singularity pull against the OCI registry used in the cloud, but with a differing checksum in the repository when using ACR (Microsoft Azure Container Registry). Users of this particular container registry should note that, at present, they will need to look up the final checksum of their OCI artifacts, with the metadata included.
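
The difference between the two checksums can be sketched in Python. This is only an illustration with hypothetical helper names: it assumes you already have the raw manifest bytes exactly as the registry serves them (how to fetch them is out of scope here), and the toy manifest below is made up. The `@digest` that pulled successfully above is the digest of the manifest, whereas `sha256sum lolnrc.sif` gives the digest of the single SIF layer listed inside it.

```python
import hashlib
import json


def sha256_digest(data):
    """Return the OCI-style digest string ("sha256:<hex>") for raw bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def describe(manifest_bytes):
    """Return (manifest_digest, [layer digests]) for raw manifest bytes.

    The manifest digest is taken over the exact bytes given, so parsing
    and re-serialising the JSON would generally produce a different value.
    """
    manifest = json.loads(manifest_bytes)
    layers = [layer["digest"] for layer in manifest.get("layers", [])]
    return sha256_digest(manifest_bytes), layers


if __name__ == "__main__":
    # Toy stand-ins: a fake "SIF" payload and a manifest that references it.
    sif_bytes = b"pretend this is a SIF image"
    raw = json.dumps(
        {"schemaVersion": 2, "layers": [{"digest": sha256_digest(sif_bytes)}]}
    ).encode()
    manifest_digest, layers = describe(raw)
    print("use with @ in container.yaml:", manifest_digest)
    print("matches sha256sum of the file:", layers[0])
```

The two-byte config in the manifest above fits the same picture: its digest, sha256:44136fa3..., is simply the digest of the empty JSON object `{}`.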

I would say my final test concludes that all the code in PR#472 works as expected. (I appreciate your work on this.)

nrcfieldsa commented 2 years ago

oh interesting! So when you add the correct digest to your container.yaml, does the install then work?

Yes it does. It might make an argument for some users to stick with a GitHub/GitLab project in case it makes the checksum situation more automated during publishing of SIF images to the repository.

vsoch commented 2 years ago

Hmm interesting. So would you say it's worth not looking at the digest and pulling verbatim via the tag? it's a moving target but it's likely less error prone.

vsoch commented 2 years ago

If you'd like to do this instead I can update the PR (and test one more time) and then good to go. Otherwise, if you think this current setup with the correct sha is better, we can perhaps add some kind of notes to the docs to warn people about it.

nrcfieldsa commented 2 years ago

Hmm interesting. So would you say it's worth not looking at the digest and pulling verbatim via the tag? it's a moving target but it's likely less error prone.

Yes, it might be a good idea to provide that option without changing the checksum behaviour, perhaps with a warning that the SIF image checksum is not validated when the string is a simple tag name or a non-conformant checksum type/length.

To support a plain tag name without singularity and/or podman attempting to treat it as a checksum (that is, when container.yaml doesn't list "sha..." or a similar checksum-looking string), shpc will need to pass a command line with :tag rather than @tag, as follows:

$ shpc install lolnrc
singularity pull --name registry/lolnrc/0.0.1/registry-lolnrc-0.0.1-latest.sif oras://registry/lolnrc:string

The tag was being treated as a checksum, though its length and type don't match. If @tag is used with a value that is not a checksum, the singularity command fails while pulling the image:

$ shpc install lolnrc
singularity pull --name registry/lolnrc/0.0.1/registry-lolnrc-0.0.1-tag.sif oras://registry/lolnrc@tag
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://registry.azurecr.io/lolnrc@tag: while resolving reference: invalid checksum digest format
..

This was the error that originally had me confused about whether the problem was in shpc or not.
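
The `:tag` versus `@digest` choice described above could look roughly like the following Python sketch. This is not shpc's actual code: `oras_uri` is a hypothetical helper, and treating only `sha256:` plus 64 hex characters as a digest is an assumption.

```python
import re
import warnings

# Assumption: only full sha256 digests count as checksums.
DIGEST_RE = re.compile(r"^sha256:[0-9a-f]{64}$")


def oras_uri(name, ref):
    """Return an oras:// URI, using '@' for digests and ':' for plain tags."""
    if DIGEST_RE.match(ref):
        return f"oras://{name}@{ref}"
    # A tag can move, so the pulled content is not pinned to a checksum.
    warnings.warn(f"{ref!r} is not a sha256 digest; pulling by tag, content is not pinned")
    return f"oras://{name}:{ref}"


if __name__ == "__main__":
    print(oras_uri("registry.azurecr.io/lolnrc", "latest"))
    print(
        oras_uri(
            "registry.azurecr.io/lolnrc",
            "sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17",
        )
    )
```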

vsoch commented 2 years ago

How about we try digest first and fall back to tag with a warning?
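
A minimal Python sketch of that "digest first, tag as fallback" idea, under stated assumptions: `pull_with_fallback` and `singularity_pull` are hypothetical names, the `singularity pull --name <dest> <uri>` form is the one shown in the logs above, and the pull function is injectable so the logic can be exercised without a registry.

```python
import subprocess
import warnings


def singularity_pull(uri, dest):
    """Run `singularity pull --name <dest> <uri>`; return True on success.

    Raises FileNotFoundError if the singularity binary is not installed.
    """
    result = subprocess.run(["singularity", "pull", "--name", dest, uri])
    return result.returncode == 0


def pull_with_fallback(name, digest, tag, dest, pull=singularity_pull):
    """Try the digest reference first, then the tag.

    Returns the URI that was pulled successfully, or None if both failed.
    """
    by_digest = f"oras://{name}@{digest}"
    if pull(by_digest, dest):
        return by_digest
    by_tag = f"oras://{name}:{tag}"
    warnings.warn(f"pull by digest failed; falling back to tag {tag!r} (not pinned)")
    if pull(by_tag, dest):
        return by_tag
    return None


if __name__ == "__main__":
    # Stand-in puller that rejects digest references, as in the failing case above.
    def fake_pull(uri, dest):
        return "@" not in uri

    print(
        pull_with_fallback(
            "registry.azurecr.io/lolnrc",
            "sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9",
            "latest",
            "lolnrc.sif",
            pull=fake_pull,
        )
    )
```

The trade-off is the one raised in the thread: the tag is a moving target, so a successful fallback no longer guarantees which image content was installed.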

vsoch commented 2 years ago

Can you tell me more detail about the issue so I have something to write into the docs?

nrcfieldsa commented 2 years ago

If you'd like to do this instead I can update the PR (and test one more time) and then good to go. Otherwise, if you think this current setup with the correct sha is better, we can perhaps add some kind of notes to the docs to warn people about it.

I'd say the PR could be approved with a note in the docs to warn about this issue of a tag vs. checksum. There's no reason to hold-up the merge as it presently works properly. If you want to include another change prior to the merge to handle falling back to a tag that is fine, but I don't think it's a requirement for the original work described.

vsoch commented 2 years ago

okay, that sounds good! What I'll do is add a note to the docs, and if I start to see many people having issues I'll do the fallback.

nrcfieldsa commented 2 years ago

One note: Prior to requesting to close issue#471..

Further testing can be performed by myself and would be needed to confirm the original targeted GitLab option: if the GitLab project with SIF image git push'ed to files (and/or in releases dir) of project, can also be leveraged in addition to singularity/docker container registry - as stated for those that don't have a registry handy..

As noted earlier, these container technologies differ in function, and I'd also like to see if the original title of the ticket can work for me with our GitLab instance. If I need to test a few changes, I could explore the code a bit and get back to you.

I think it adds value to this tool to have flexibility in source of containers, as it largely depends on the environment which tools can be leveraged for the registry, when they cannot just push to the public repositories in all cases. Theoretically, they could install a particular tool on a local machine, but the audience is sure to be wider if they can use what's readily available to them to host their images for a particular project - such as in the common case that git is used, to facilitate access across multiple installations / clusters.

vsoch commented 2 years ago

Sure, happy to keep the issue opened - let me know what you'd like to test / try next, or just open a PR if you want to suggest changes!

I believe GitLab has an equivalent registry that can handle oras, so I would try to use that before GitHub and artifacts. That really was more of a hack than anything else.

I think it adds value to this tool to have flexibility in source of containers, as it largely depends on the environment which tools can be leveraged for the registry, when they cannot just push to the public repositories in all cases. Theoretically, they could install a particular tool on a local machine, but the audience is sure to be wider if they can use what's readily available to them to host their images for a particular project - such as in the common case that git is used, to facilitate access across multiple installations / clusters.

Indeed it would be interesting to think about an easy build -> deploy pipeline - right now we require the container.yaml recipe but arguably there could be a faster:

$ shpc install oras://<>

The reason I haven't done that is because I want to encourage users to contribute their container recipes, and to take the few seconds to write the container.yaml to add needed entrypoints, etc.

nrcfieldsa commented 2 years ago

The reason I haven't done that is because I want to encourage users to contribute their container recipes, and to take the few seconds to write the container.yaml, to add needed entrypoints, etc.

I agree. It is helpful to contribute recipes to upstream where possible. The endpoints feature is really handy and makes this solution useful for bridging between traditional hpc modules and containers.

Indeed it would be interesting to think about an easy build -> deploy pipeline - right now we require the container.yaml recipe but arguably there could be a faster:

$ shpc install oras://<>

Perhaps, in this case (directly pulling a container), it would be possible to store the yaml specifics in a layer of the SIF image prior to populating the registry, so that it includes all of the entry points and command mappings, or otherwise defaults to the best-known target for docker/singularity (where not present).

I believe GitLab has an equivalent registry that can handle oras, so I would try to use that before GitHub and artifacts.

I will give it a further try on port 4000 after opening the firewall flow. I was originally under the impression that the GitLab container registry only supports docker format images, but other documentation suggests this may be possible. I did manage to singularity push oras:// on port 443 using an access token - but I cannot see the uploaded image in GitLab anywhere, likely because it has to communicate on port 4000 in order to get to the container registry enabled for this project on the host.

On the topic of host specification: it might be ideal to assume the mainline GitHub.com project by default and then provide an option to override the base URL for GitLab users.

Here is the next slightly cryptic error output - with an incorrect container.yaml file -- which is not being used as expected by shpc, but rather presumably as intended by the admin (me). If I was to point to an SIF image in the project's files, where should I put it for shpc to find it?

$ cat lolnrc/container.yaml
#gh: gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc
#url: https://gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc
gh: gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/-/raw/main/lolnrc
url: https://gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/-/raw/main
maintainer: '@nrcfieldsa'
description: Cowsay NRC from GitLab project file lolnrc.sif
latest:
  latest: "sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9"
tags:
  latest: "sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9"
aliases:
  ghnrc: /singularity

$ shpc install ghnrc
singularity pull --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/-/raw/main/lolnrc/latest/gitlabxyz.nrc-cnrc.gc.ca-nrcfieldsa-singularity-test---raw-main-lolnrc-latest-sha256:d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9.sif https://github.com/gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/releases/download/singularity-test/-/raw/main/lolnrc/sha256/gitlabxyz.nrc-cnrc.gc.ca-nrcfieldsa.d652da62c85a2d415c8e2d44a2547e14aa66aa8a0107881f83e255dce7c083c9:latest.sif
INFO:    Downloading network image
FATAL:   the requested image was not found
INFO:    Downloading network image
FATAL:   the requested image was not found
There was an issue pulling None

The problem with my original use case (which, as reading the code reveals, is not a good idea the way I currently attempt it) is that the shpc tool still tries to reach github.com, even if I prefix the gh: package with the hostname, as in gh: hostname.tld/project/pkg, to give the exact path on the repository for the download.

It does something a bit different from what I wanted (specifying the download URL of the GitLab project). I could always push the SIF images organized in the expected path, with release and checksum in the filename, but shpc still needs to reach the correct GitLab server to enable local site usage.

vsoch commented 2 years ago

Perhaps, in this case (directly pulling a container), it would be possible to store the yaml specifics in a layer of the SIF image prior to populating the registry, so that it includes all of the entry points and command mappings, or otherwise defaults to the best-known target for docker/singularity (where not present).

If you are interested in this check out the scientific filesystem - it lets you build a container with multiple entrypoints. https://sci-f.github.io. It's native to Singularity and I made separate clients too (e.g., so you can use in Docker).

I did manage to singularity push oras:// on port 443 using an access token - but I cannot see the uploaded image in GitLab anywhere, likely because it has to communicate on port 4000 in order to get to the container registry enabled for this project on the host.

I have actually pushed SIF to GitLab, and it had the same issue with kind of freaking out without a tag. But even if the UI is buggy I could still pull it!

On the topic of host specification: it might be ideal to assume the mainline GitHub.com project by default and then provide an option to override the base URL for GitLab users.

This is also defined in the image URI, e.g., ghcr.io would be GitHub container registry, I think GitLab has something similar. I would personally use GitHub (I like it better) but that's just me!

Here is the next slightly cryptic error output - with an incorrect container.yaml file -- which is not being used as expected by shpc, but rather presumably as intended by the admin (me). If I was to point to an SIF image in the project's files, where should I put it for shpc to find it?

I think maybe you forgot our conversation? gh: is not the field you want, nor is URL. You want oras.

nrcfieldsa commented 2 years ago

This is also defined in the image URI, e.g., ghcr.io would be GitHub container registry, I think GitLab has something similar. I would personally use GitHub (I like it better) but that's just me!

OK, got it - change from gh: to oras: at the beginning of line and retest.

Here is the next slightly cryptic error output - with an incorrect container.yaml file -- which is not being used as expected by shpc, but rather presumably as intended by the admin (me). If I was to point to an SIF image in the project's files, where should I put it for shpc to find it?

I think maybe you forgot our conversation? gh: is not the field you want, nor is URL. You want oras.

I can reach port 4000 now to test.

When using oras: gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/ I am now getting an authentication error despite first doing a singularity remote add and singularity remote login -- so I'll need to straighten that out first to make sure this is working.

time="2022-01-25T20:40:44-05:00" level=info msg="trying next host" error="pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed" host="gitlabxyz.nrc-cnrc.gc.ca:4000"

vsoch commented 2 years ago

Ah interesting! See if you can get it to work with singularity on the command line. If we need to tweak the singularity command (e.g., ensure that credentials are there) that is something we can do. I was hoping the client would just find them in the environment (make sure they are exported).

nrcfieldsa commented 2 years ago

Now that I have it working with singularity on the command line, it also works with the shpc install command.

$ singularity push /tmp/miniconda_latest.sif oras://gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/miniconda@sha256:12bdd0d52ad293143d77cf9b8dc098343e7c487a247426a02bee95206e753636
INFO:    Upload complete
$ singularity shell --disable-cache oras://gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/miniconda:latest
INFO:    Downloading oras image to tmp cache: /tmp/sbuild-tmp-cache-008516530
INFO:    Downloading oras image
Singularity> conda --version
conda 4.8.2
Singularity> exit

[nrcfieldsa@container-test singularity-hpc]$ shpc install miniconda
singularity pull --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/4000/nrcfieldsa/singularity-test/miniconda/latest/gitlabxyz.nrc-cnrc.gc.ca:4000-nrcfieldsa-singularity-test-miniconda-latest-sha256:3519e23352927a34f8dde704d79ce1d45d63f2be29a36807c2cc16116216d6f9.sif oras://gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/miniconda@sha256:3519e23352927a34f8dde704d79ce1d45d63f2be29a36807c2cc16116216d6f9
INFO:    Using cached SIF image
INFO:    Using cached SIF image
/home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/4000/nrcfieldsa/singularity-test/miniconda/latest/gitlabxyz.nrc-cnrc.gc.ca:4000-nrcfieldsa-singularity-test-miniconda-latest-sha256:3519e23352927a34f8dde704d79ce1d45d63f2be29a36807c2cc16116216d6f9.sif
Module miniconda:latest was created.
[nrcfieldsa@container-test singularity-hpc]$ module avail miniconda

---------------------------------------------- ./modules -----------------------------------------------
   gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/miniconda/latest/module

The port 4000 however throws it for a bit of a loop:

[nrcfieldsa@container-test singularity-hpc]$ set
..
miniconda-run ()
{
    singularity ${SINGULARITY_OPTS} run ${SINGULARITY_COMMAND_OPTS} -B /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/miniconda/latest/99-shpc.sh:/.singularity.d/env/99-shpc.sh /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/4000/nrcfieldsa/singularity-test/miniconda/latest/gitlabxyz.nrc-cnrc.gc.ca:4000-nrcfieldsa-singularity-test-miniconda-latest-sha256:3519e23352927a34f8dde704d79ce1d45d63f2be29a36807c2cc16116216d6f9.sif $@
}
miniconda-shell ()
{
    singularity ${SINGULARITY_OPTS} shell ${SINGULARITY_COMMAND_OPTS} -s /bin/sh -B /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/miniconda/latest/99-shpc.sh:/.singularity.d/env/99-shpc.sh /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/4000/nrcfieldsa/singularity-test/miniconda/latest/gitlabxyz.nrc-cnrc.gc.ca:4000-nrcfieldsa-singularity-test-miniconda-latest-sha256:3519e23352927a34f8dde704d79ce1d45d63f2be29a36807c2cc16116216d6f9.sif
}
[nrcfieldsa@container-test registry]$ miniconda-shell
FATAL:   while parsing bind path: while getting bind path:  is not a valid bind option

vsoch commented 2 years ago

Why did you add 4000 to a directory bind?

vsoch commented 2 years ago

The reason it doesn't work is because the bind syntax expects a : to separate the source from dest, and when you add the port it parses as such. Are you not able to provide a full domain name without a custom port?
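
To illustrate the collision, here is a simplified sketch of colon-based splitting of a bind spec of the form src[:dest[:opts]]. This is not Singularity's actual parser, and the shortened paths and the `split_bind` helper are made up for the example; it only shows why an extra ":" in the source path shifts every field.

```python
def split_bind(spec):
    """Naively split a bind spec of the form src[:dest[:opts]] on colons.

    Simplified illustration only; Singularity's real parsing differs.
    """
    parts = spec.split(":")
    src = parts[0]
    dest = parts[1] if len(parts) > 1 else src
    opts = parts[2:]
    return src, dest, opts


# Hypothetical, shortened paths: one without and one with a port in the path.
good = "/modules/example.org/app/99-shpc.sh:/.singularity.d/env/99-shpc.sh"
bad = "/modules/example.org:4000/app/99-shpc.sh:/.singularity.d/env/99-shpc.sh"

print(split_bind(good))
print(split_bind(bad))
```

In the second case the source is cut off at the hostname, the remainder of the source path is taken as the destination, and the real destination lands in the options slot, which is roughly the kind of misparse behind the "not a valid bind option" failure.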

nrcfieldsa commented 2 years ago

Why did you add 4000 to a directory bind?

It doesn't work without it.

The reason it doesn't work is because the bind syntax expects a : to separate the source from dest, and when you add the port it parses as such. Are you not able to provide a full domain name without a custom port?

If I try without port 4000, it does not reach the "container registry" but rather the "repository" login page on port 443 (despite specifying oras://), and therefore asks for a login or gives a 404 error instead.

[nrcfieldsa@container-test ghnrc]$ singularity pull oras://gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/miniconda:latest
21:41:55.715838 IP 10.24.w.z.50910 > 10.0.x.y.https: Flags [S], seq 236244380, win 29200, options [mss 1460,sackOK,TS val 1137606780 ecr 0,nop,wscale 7], length 0
21:41:56.358373 IP 10.0.x.y.https > 10.24.w.z.50910: Flags [P.], seq 15932:18185, ack 953, win 243, options [nop,nop,TS val 229602026 ecr 1137607325], length 2253
21:41:56.358382 IP 10.24.w.z.50910 > 10.0.x.y.https: Flags [.], ack 18185, win 546, options [nop,nop,TS val 1137607423 ecr 229602026], length 0
21:41:56.358897 IP 10.24.w.z.50910 > 10.0.x.y.https: Flags [P.], seq 953:988, ack 18185, win 546, options [nop,nop,TS val 1137607423 ecr 229602026], length 35
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/miniconda:latest: could not get image manifest, received mediaType: text/html
21:41:56.360673 IP 10.24.w.z.50910 > 10.0.x.y.https: Flags [F.], seq 988, ack 18185, win 546, options [nop,nop,TS val 1137607425 ecr 229602026], length 0

$ curl https://gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/miniconda@sha256:12bdd0d52ad293143d77cf9b8dc098343e7c487a247426a02bee95206e753636
<html><body>You are being <a href="https://gitlabxyz.nrc-cnrc.gc.ca/users/sign_in">redirected</a>.</body></html>
vsoch commented 2 years ago

Ah, so I think what we need is a field to add a port, and it will only be used for the pull part (but not the other reference). I'll do a PR for you to test out.

vsoch commented 2 years ago

@nrcfieldsa this is going to need some back and forth because I can't test with your container, but here is start of work for you to test! Let's first see if we can get the pull and then run to work, and then we can move on to other interactions. https://github.com/singularityhub/singularity-hpc/pull/476

nrcfieldsa commented 2 years ago

With the latest changes in this PR, the port spec is handled appropriately for the shell aliases. However, there is one issue - that now the package name is missing in the module string following shpc install.

[nrcfieldsa@container-test singularity-hpc]$ git branch -v
* add/port                cb49165e7 testing parsing port, likely will break some other interactions so needs full testing
[nrcfieldsa@container-test registry]$ shpc --version
0.0.4
[nrcfieldsa@container-test registry]$ cat lolnrcoras/container.yaml
oras: gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/lolnrc
url: https://gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/
maintainer: '@nrcfieldsa'
description: Cowsay NRC from GitLab
latest:
  latest: sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17
tags:
  latest: sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17
aliases:
  lolnrc: /singularity

[nrcfieldsa@container-test registry]$ shpc install lolnrcoras
singularity pull --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/latest/gitla
bc.nrc-cnrc.gc.ca:4000-nrcfieldsa-singularity-test-lolnrc-latest-sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db340726
80d6926e17.sif oras://gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/lolnrc@sha256:91b243d9397763ba3a7a5e126255075895
44b4fc90db71db34072680d6926e17
INFO:    Using cached SIF image
INFO:    Using cached SIF image
/home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/latest/gitlabxyz.nrc-cnrc.gc.ca:4000-a
llan.fields-singularity-test-lolnrc-latest-sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17.sif
Module lolnrcoras:latest was created.
[nrcfieldsa@container-test registry]$

[nrcfieldsa@container-test singularity-hpc]$ export MODULEPATH=$(pwd)/modules:$MODULEPATH
[nrcfieldsa@container-test singularity-hpc]$ module avail

----------------------------- /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules -----------------------------
   gitlabxyz.nrc-cnrc.gc.ca/latest/module

-------------------------------------------- /usr/share/lmod/lmod/modulefiles/Core ---------------------------------------------
   lmod    settarg

[nrcfieldsa@container-test singularity-hpc]$ module load gitlabxyz.nrc-cnrc.gc.ca/latest/module
[nrcfieldsa@container-test singularity-hpc]$ lolnrc
lolnrc                    lolnrc-inspect-deffile    lolnrc-run
lolnrc-exec               lolnrc-inspect-runscript  lolnrc-shell
[nrcfieldsa@container-test singularity-hpc]$ which lolnrc
lolnrc ()
{
    singularity ${SINGULARITY_OPTS} exec ${SINGULARITY_COMMAND_OPTS} -B /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/latest/99-shpc.sh:/.singularity.d/env/99-shpc.sh /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/latest/gitlabxyz.nrc-cnrc.gc.ca:4000-nrcfieldsa-singularity-test-lolnrc-latest-sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17.sif /singularity $@
}
[nrcfieldsa@container-test singularity-hpc]$ lolnrc-shell
Singularity> cowsay "PR#476"
 ________
< PR#476 >
 --------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
Singularity> exit

vsoch commented 2 years ago

@nrcfieldsa could you please provide me a container that I have permission to pull from this address (with the port)? I'm having trouble testing and debugging without being able to reproduce. And then if you could share the container.yaml (the latest) that would be great. Thank you!

vsoch commented 2 years ago

Also this doesn't make sense:

$ shpc install lolnrcoras

The namespace of the container.yaml should match the registry URI. So this should be under a module called:

gitlabxyz.nrc-cnrc.gc.ca/lolnrc/container.yaml

And I agree the module name is missing! Having an image to pull will make this easier to debug.

nrcfieldsa commented 2 years ago

While I'd like to help out with testing, there is no practical way I can open inbound traffic to the GitLab server on the internal network for port 4000, as it would be against our network policy. However, I have made a small attempt to find the problem with parsing the name when the ":port" is in the string.

Isolating parse issue to file: shpc/main/client.py :

        # If the module name has a tag, only test it
        if ":" in module_name:
            module_name = module_name.split(":", 1)[0]
            tags = [config.tag.name]

/registry/lolnrc/container.yaml:

oras: gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/lolnrc
url: https://gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/
maintainer: '@nrcfieldsa'
description: Cowsay NRC from GitLab
latest:
  latest: sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17
tags:
  latest: sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17
aliases:
  lolnrc: /singularity

Attempt to troubleshoot string split expressions: ./test_uri_split.py:

# Test script: how splitting on ":" behaves when the URI contains a port.
print('Test: shpc URI parsing example')
print('-----')

uri = "oras://gitlabxyz.nrc-cnrc.gc.ca:4000/lolnrc@sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17"

print('Attempt to parse ":" in string..')
if uri and ":" in uri:
    u = uri.split(":")
    proto = u[0]                       # text before "://"
    host = u[1].replace("/", "")       # strip the leading "//"
    s = u[2].split("/", 1)             # "port/name@sha256"
    port = s[0]
    name = s[1].split("@")[0]
    cksum = uri.split("@")[1]

    print('URI:   %s' % uri)
    print('proto: %s' % proto)
    print('host:  %s' % host)
    print('port:  %s' % port)
    print('name:  %s' % name)
    print('cksum: %s' % cksum)

print()
print("Show working behaviour..")
module_name = "/lolnrc:latest"            # works
if module_name and ":" in module_name:
    f = module_name.split(":", 1)[0]
    print('module_name: "%s"' % module_name)
    print('name: "%s"' % f)

print()
print("Now attempt with port number in string..")
module_name = ":4000/lolnrc@sha256"       # fails
if module_name and ":" in module_name:
    f = module_name.split(":", 1)[0]
    print('module_name: "%s"' % module_name)
    print('name: "%s"' % f)

module_name = ":4000/lolnrc:latest"       # fails
if module_name and ":" in module_name:
    f = module_name.split(":", 1)[0]
    print('module_name: "%s"' % module_name)
    print('name: "%s"' % f)

print()
print("Try to fix parsing..")
if module_name and ":" in module_name:
    # Drop the trailing ":tag", then drop everything up to the first "/"
    f = module_name.rsplit(":", 1)[0].split("/", 1)[1]
    print('module_name: "%s"' % module_name)
    print('name: "%s"' % f)

Output:

[nrcfieldsa@container-test shpc]$ python ./test_uri_split.py
Test: shpc URI parsing example
-----
Attempt to parse ":" in string..
URI:   oras://gitlabxyz.nrc-cnrc.gc.ca:4000/lolnrc@sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17
proto: oras
host:  gitlabxyz.nrc-cnrc.gc.ca
port:  4000
name:  lolnrc
cksum: sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17

Show working behaviour..
module_name: "/lolnrc:latest"
name: "/lolnrc"

Now attempt with port number in string..
module_name: ":4000/lolnrc@sha256"
name: ""
module_name: ":4000/lolnrc:latest"
name: ""

Try to fix parsing..
module_name: ":4000/lolnrc:latest"
name: "lolnrc"

I am not sure I read all of the code correctly, but it seems that because the colon in :4000 is matched first, the left-hand side of the split ([0]) is "" (an empty string). The code expects the first ":" delimiter to be the one before the tag at the end of the container name, but encounters the port's colon first. Can the first slash after "host:port/" be used to split off the port number at the beginning of the expression?
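
A rough sketch of that first-slash idea, assuming the reference always starts with a registry host and carries no oras:// scheme. The `parse_reference` helper is hypothetical and is not how shpc parses names: a colon before the first slash is read as a port, while a colon in the last path component is read as a tag.

```python
def parse_reference(ref):
    """Split host[:port]/name[:tag|@digest] into its parts.

    Hypothetical helper; assumes ref always begins with a registry host.
    """
    # Everything before the first slash is the host, with an optional port.
    hostport, _, rest = ref.partition("/")
    host, _, port = hostport.partition(":")

    # A digest is introduced by "@"; otherwise a tag follows a ":" that
    # appears in the final path component.
    name, tag, digest = rest, None, None
    if "@" in rest:
        name, digest = rest.split("@", 1)
    elif ":" in rest.rsplit("/", 1)[-1]:
        name, tag = rest.rsplit(":", 1)
    return {"host": host, "port": port or None, "name": name,
            "tag": tag, "digest": digest}


print(parse_reference("gitlabxyz.nrc-cnrc.gc.ca:4000/lolnrc:latest"))
print(parse_reference("gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc"))
```

The weak point is the assumption itself: a bare module name such as lolnrc:latest has no host part, so this sketch would read latest as a port.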

vsoch commented 2 years ago

@nrcfieldsa that's definitely where the issue is - but I'm not sure we can easily programmatically figure out how to distinguish a name having a tag vs just a port. And then the number of slashes can vary based on the nesting of the name.

I'm sort of thinking we need to take a different approach - adding "port" as a separate field in the container.yaml and then only using it for the pull (and keeping everything else consistent).

vsoch commented 2 years ago

I'll make some time tonight to give this a shot and push a new version to test!

nrcfieldsa commented 2 years ago

I'm sort of thinking we need to take a different approach - adding "port" as a separate field in the container.yaml and then only using it for the pull (and keeping everything else consistent).

Would work well.

I'll make some time tonight to give this a shot and push a new version to test!

OK, much appreciated.

vsoch commented 2 years ago

okay - just pushed a second effort! https://github.com/singularityhub/singularity-hpc/pull/476/files

Take a look at the container.yaml there - you'll need to remove port from the container oras address and add it separately as a field port:. Let me know asap if there are any bugs or issues - I developed this pretty blindly because I don't know of a registry I can use to test that has a port.

nrcfieldsa commented 2 years ago

Take a look at the container.yaml there - you'll need to remove port from the container oras address and add it separately as a field port:. Let me know asap if there are any bugs or issues

There is one issue with a "::" double-colon, due to string concatenation.

[nrcfieldsa@container-test singularity-hpc]$ shpc install lolnrc
singularity pull --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc/latest/gitlabxyz.nrc-cnrc.gc.ca-nrcfieldsa-singularity-test-lolnrc-latest-sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17.sif oras://gitlabxyz.nrc-cnrc.gc.ca::4000/nrcfieldsa/singularity-test/lolnrc@sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://gitlabxyz.nrc-cnrc.gc.ca::4000/nrcfieldsa/singularity-test/lolnrc@sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17: while resolving reference: address gitlabxyz.nrc-cnrc.gc.ca::4000: too many colons in address
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://gitlabxyz.nrc-cnrc.gc.ca::4000/nrcfieldsa/singularity-test/lolnrc@sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17: while resolving reference: address gitlabxyz.nrc-cnrc.gc.ca::4000: too many colons in address
There was an issue pulling None

Just to confirm: container.yaml

oras: gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc
port: 4000
url: https://gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/
maintainer: '@nrcfieldsa'
description: Cowsay NRC from GitLab
..

vsoch commented 2 years ago

@nrcfieldsa it looks like the module name is messed up too, but let's start with the port. Can you add prints here https://github.com/singularityhub/singularity-hpc/pull/476/files#diff-39abab5db87b7192cc19b4f52519004d2a02e30e4b748a35d883c88651a58a3dR66 so we can figure out where the :: is generated? I can't figure it out just looking at the code.

nrcfieldsa commented 2 years ago

Error message with print statements:

[nrcfieldsa@container-test singularity-hpc]$ shpc install lolnrc
port = :4000
host + : + port + / + rest = gitlabxyz.nrc-cnrc.gc.ca::4000/nrcfieldsa/singularity-test/lolnrc
singularity pull --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc/latest/gitlabxyz.nrc-cnrc.gc.ca-nrcfieldsa-singularity-test-lolnrc-latest-sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17.sif oras://gitlabxyz.nrc-cnrc.gc.ca::4000/nrcfieldsa/singularity-test/lolnrc@sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://gitlabxyz.nrc-cnrc.gc.ca::4000/nrcfieldsa/singularity-test/lolnrc@sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17: while resolving reference: address gitlabxyz.nrc-cnrc.gc.ca::4000: too many colons in address
FATAL:   While pulling image from oci registry: error fetching image to cache: failed to get checksum for oras://gitlabxyz.nrc-cnrc.gc.ca::4000/nrcfieldsa/singularity-test/lolnrc@sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17: while resolving reference: address gitlabxyz.nrc-cnrc.gc.ca::4000: too many colons in address
There was an issue pulling None

The fix is to eliminate the second ":" being added after the host:

[nrcfieldsa@container-test singularity-hpc]$ git diff -v add/port
diff --git a/shpc/main/container/base.py b/shpc/main/container/base.py
index de473bbc4..cf11da7af 100644
--- a/shpc/main/container/base.py
+++ b/shpc/main/container/base.py
@@ -71,10 +71,12 @@ class ContainerTechnology:
         """
         # If there is a config port, we need to add to end of URI
         port = ":" + str(config.port) if config.port else ""
+        print("port = %s" % port)
         if not port:
             return uri
         host, rest = uri.split("/", 1)
-        return host + ":" + port + "/" + rest
+        print("host + port + / + rest = %s" % (host + port + "/" + rest))
+        return host + port + "/" + rest

     def add_environment(self, module_dir, envars, environment_file):
         """

[nrcfieldsa@container-test singularity-hpc]$ shpc install lolnrc
port = :4000
host + port + / + rest = gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/lolnrc
singularity pull --name /home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc/latest/gitlabxyz.nrc-cnrc.gc.ca-nrcfieldsa-singularity-test-lolnrc-latest-sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17.sif oras://gitlabxyz.nrc-cnrc.gc.ca:4000/nrcfieldsa/singularity-test/lolnrc@sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17
INFO:    Using cached SIF image
INFO:    Using cached SIF image
/home/nrcfieldsa/github.com/singularityhub/singularity-hpc/modules/gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc/latest/gitlabxyz.nrc-cnrc.gc.ca-nrcfieldsa-singularity-test-lolnrc-latest-sha256:91b243d9397763ba3a7a5e12625507589544b4fc90db71db34072680d6926e17.sif
Module lolnrc:latest was created.

[nrcfieldsa@container-test singularity-hpc]$ ls -l modules/gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc/latest/module.lua
-rw-rw-r--. 1 nrcfieldsa nrcfieldsa 5125 Feb  3 12:23 modules/gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc/latest/module.lua

It appears that the module name is correct as the '/nrcfieldsa/singularity-test/lolnrc' path matches the project path on the GitLab server gitlabxyz.nrc-cnrc.gc.ca accessed on port 4000.
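
For reference, the corrected logic from the diff above can be sketched as a standalone function without the debug prints. The name `add_port` is made up for this example, and the port is passed in directly rather than read from config.port.

```python
def add_port(uri, port=None):
    """Insert :port after the registry host of a URI like host/path/name.

    Standalone illustration of the fix; not the actual shpc method.
    """
    if not port:
        return uri
    host, rest = uri.split("/", 1)
    # Add exactly one colon between host and port.
    return "%s:%s/%s" % (host, port, rest)


uri = "gitlabxyz.nrc-cnrc.gc.ca/nrcfieldsa/singularity-test/lolnrc"
print(add_port(uri, 4000))
print(add_port(uri))
```

Only the pull URI gets the port; the module directory and file names are still derived from the port-free oras: value, which is why the module path above no longer contains :4000.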