NVIDIA / pyxis

Container plugin for Slurm Workload Manager
Apache License 2.0

ERROR while pulling images #71

Closed SolenoidWGT closed 2 years ago

SolenoidWGT commented 2 years ago

Hi everyone, I ran into a problem when running the pyxis tests. Pulling the PyTorch 20.02 image failed with "curl: (22) The requested URL returned error: 401 Unauthorized", even though the log also shows "Authentication succeeded". Why does this happen? Am I missing some configuration?

Here is the error log:

✗ nvcr.io PyTorch 20.02
  (from function `run_srun' in file tests/./common.bash, line 32,
   in test file tests/docker_image.bats, line 22)
    `run_srun --container-image=nvcr.io#nvidia/pytorch:20.02-py3 sh -c 'echo $PYTORCH_VERSION'' failed

Here is my system info: CentOS 7, enroot 3.4.0, Slurm 20.11.8-6, pyxis 0.12.0, Docker 20.10.12.

Thank you all

flx42 commented 2 years ago

Hello,

It might have been a temporary issue with the NVIDIA container registry at nvcr.io. Is it still failing every time?

You could also check if you have more luck with the very latest pytorch image:

$ enroot import docker://nvcr.io#nvidia/pytorch:22.01-py3

SolenoidWGT commented 2 years ago

Thank you very much. The problem seems to be related to the network configuration on my system; I will deal with it later. In the meantime, I can use enroot to download the container image directly and then give pyxis the absolute path to the downloaded image.
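
For anyone hitting the same error, the workaround described above might look roughly like the sketch below. It assumes `enroot import -o` is used to write a squashfs file and that the file's absolute path is then passed to `--container-image`. The output location under `$HOME/images`, the `SQSH_PATH` and `DRY_RUN` variables, and the `run` helper are made up for illustration and are not part of pyxis or enroot. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch of the workaround: import the image once with enroot, then point
# pyxis at the resulting squashfs file instead of the registry URI.

# Image URI taken from the thread.
IMAGE_URI='docker://nvcr.io#nvidia/pytorch:22.01-py3'

# Hypothetical output location (assumption); must be an absolute path that
# is readable from the compute nodes.
SQSH_PATH="${SQSH_PATH:-$HOME/images/pytorch-22.01-py3.sqsh}"

# Illustrative helper: with DRY_RUN=1 (the default) it only prints the
# command (shell quoting is not preserved in the printed form);
# with DRY_RUN=0 it executes the command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# 1. Make sure the target directory exists.
run mkdir -p "$(dirname "$SQSH_PATH")"

# 2. Download the image and convert it to a squashfs file.
run enroot import -o "$SQSH_PATH" "$IMAGE_URI"

# 3. Start the job from the local file rather than pulling from nvcr.io.
run srun --container-image="$SQSH_PATH" sh -c 'echo $PYTORCH_VERSION'
```

Importing once on a node with working registry access and reusing the file avoids pulling from the registry at job launch, which is where the 401 showed up here.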