NVIDIA / pyxis

Container plugin for Slurm Workload Manager
Apache License 2.0

enroot import container_image #70

Closed cheyunfei closed 2 years ago

cheyunfei commented 2 years ago

Hello, I'm trying to run a container image through Slurm. I have installed pyxis and enroot.

But when the program starts, an error occurs: "URL https://registry-1.docker.io/v2/library/mlperf-nvidia/manifests/minigo returned error code: 401 Unauthorized"

I eventually figured out that when Pyxis calls the "enroot" command, it uses "enroot import docker://image[:TAG]" by default. This pulls the container image from a remote registry, and I don't have an account for that registry, "https://registry-1.docker.io/v2".

Now I want Pyxis to call "enroot import dockerd://image[:TAG]" instead of "enroot import docker://image[:TAG]". That way, I can use a container image from the local Docker daemon instead of from a remote registry.

So, how can I do this? Thanks.

cheyunfei commented 2 years ago

Or, can I configure this through environment variables?

cheyunfei commented 2 years ago

My command is: srun --container-image="mlperf-nvidia:minigo" --container-name="minigo"

I get an error: URL https://registry-1.docker.io/v2/library/mlperf-nvidia/manifests/minigo returned error code: 401 Unauthorized

When I use: srun --container-image="dockerd://mlperf-nvidia:minigo" --container-name="minigo"

I get a different error: error: pyxis: [ERROR] Invalid image reference: docker://dockerd://mlperf-nvidia:minigo

flx42 commented 2 years ago

Sorry, but you can't use the dockerd:// import feature from enroot when using pyxis. Slurm clusters usually don't have the docker daemon running.

You can do something like this instead:

$ enroot import dockerd://image:tag
$ srun --container-image ./image+tag.sqsh [...]

cheyunfei commented 2 years ago

Thanks, this problem has been solved. I got a squashfs file by running "enroot import dockerd://image:tag".
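
For later readers, the workaround that resolved this issue can be sketched as a small bash script. This is only an illustration, not part of pyxis or enroot: the helper names `sqsh_path` and `import_and_run` are hypothetical, and the sketch assumes that the import runs on a host where the Docker daemon is available and that the resulting squashfs file sits on a filesystem the compute nodes can read. It passes an explicit file name via `enroot import --output` so it does not rely on enroot's default output naming.

```shell
#!/usr/bin/env bash
# Sketch: export an image from the local Docker daemon to a squashfs file,
# then pass that file to pyxis. Helper names are hypothetical.

# Build an explicit output path for the squashfs file.
# Slashes in the image name are replaced with "+" to get a single file name.
sqsh_path() {
    local image="$1" tag="$2" dir="${3:-.}"
    printf '%s/%s+%s.sqsh\n' "$dir" "${image//\//+}" "$tag"
}

# Import from dockerd, then run a command in the imported image through srun.
# Assumes $PWD is visible from the compute nodes (e.g. a shared filesystem).
import_and_run() {
    local image="$1" tag="$2"
    shift 2
    local out
    out="$(sqsh_path "$image" "$tag" "$PWD")"

    # --output selects the squashfs file that enroot writes.
    enroot import --output "$out" "dockerd://${image}:${tag}" || return 1

    # Pass the squashfs file path to pyxis instead of an image reference.
    srun --container-image="$out" "$@"
}
```

With these functions sourced, a call such as `import_and_run mlperf-nvidia minigo --container-name=minigo hostname` would write `mlperf-nvidia+minigo.sqsh` to the current directory and start the job from that file, without contacting a remote registry.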