NVIDIA / nvidia-docker

Build and run Docker containers leveraging NVIDIA GPUs
Apache License 2.0

Probable CDN problems! #1624

Closed hqnicolas closed 2 years ago

hqnicolas commented 2 years ago

7 13.24 E: Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/Packages 404 Not Found

7 13.24 E: Failed to fetch https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/Packages 404 Not Found

bermeitinger-b commented 2 years ago

The same happens for the ubuntu2004 path: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/

Example:

E: Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/./cuda-compiler-11-4_11.4.4-1_amd64.deb  404  Not Found [IP: 152.199.20.126 443]

klueska commented 2 years ago

This is a known issue and internal teams at NVIDIA are working to resolve it.

hqnicolas commented 2 years ago

Could someone please ask the Lapsus$ hackers whether they have a backup of these folders?

Climbgunks commented 2 years ago

piling on w/ 18.04 -- assuming it's the same issue:

The repository 'https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release' does not have a Release file.

gaitanignacio commented 2 years ago

I'm getting the same issue right now on a docker build from an old Dockerfile that I saw working properly before!

Basically:

FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.6.0-gpu-py36-cu110-ubuntu18.04

RUN apt-get update
E: The repository 'https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.

lorriexingfang86 commented 2 years ago

I'm getting the same error when I run: apt-get update

for the image: tensorflow/tensorflow:2.6.1-gpu

rrlamichhane commented 2 years ago

I'm getting an error as well, but it's different from most that have been posted here. This is the error I'm getting; hopefully it will help the debuggers at NVIDIA resolve it sooner:

Reading package lists...
18:19:34
  W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY F60F4B3D7FA2AF80
18:19:34
  E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Release' is not signed.
18:19:35
  The command '/bin/sh -c apt update && apt install -y cuda-compat-11-4' returned a non-zero code: 100

santiagoolivar2017 commented 2 years ago

Same here while executing RUN apt-get update --fix-missing apt-get install

I am getting the following error:

Reading package lists...
E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1710/x86_64  Release' does not have a Release file.
E: The repository 'https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64  Release' does not have a Release file.
The command '/bin/sh -c apt-get update --fix-missing apt-get install returned a non-zero code: 100

ryan-williams commented 2 years ago

Minimal bash one-liner test, since I haven't seen anyone post it exactly:

docker run --rm nvidia/cuda:11.4.0-base apt-get update
Get:1 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Ign:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  InRelease
Get:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
Get:5 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal/multiverse amd64 Packages [177 kB]
Ign:7 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu2004/x86_64  InRelease
Get:8 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Release [696 B]
Get:9 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages [1275 kB]
Get:10 http://archive.ubuntu.com/ubuntu focal/restricted amd64 Packages [33.4 kB]
Get:11 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages [11.3 MB]
Err:12 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu2004/x86_64  Release
  404  Not Found [IP: 152.195.19.142 443]
Get:13 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Release.gpg [836 B]
Get:14 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [1160 kB]
Get:15 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1149 kB]
Get:16 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [30.3 kB]
Get:17 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [2124 kB]
Get:18 http://archive.ubuntu.com/ubuntu focal-backports/main amd64 Packages [51.2 kB]
Get:19 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [26.0 kB]
Get:20 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Packages [629 kB]
Get:21 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [1708 kB]
Get:22 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [863 kB]
Get:23 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [1086 kB]
Get:24 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [25.8 kB]
Reading package lists...
E: The repository 'https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu2004/x86_64  Release' does not have a Release file.
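To see at a glance which repositories are failing in output like the above, the error lines can be filtered. This is only a sketch: the function name `failing_repos` is hypothetical, and it assumes the wording of the "Release file" error messages quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read apt-get update output on stdin and print the URL
# of every repository reported as lacking a Release file, one per line.
# Assumes error lines shaped like those quoted in this thread:
#   E: The repository '<url>  Release' does not have a Release file.
#   E: The repository '<url> Release' no longer has a Release file.
failing_repos() {
    sed -n "s/.*E: The repository '\([^ ']*\)[^']*' .* a Release file\.\$/\1/p" | sort -u
}

# Possible usage (apt prints errors on stderr, hence 2>&1):
#   docker run --rm nvidia/cuda:11.4.0-base apt-get update 2>&1 | failing_repos
```

Lines that do not match the pattern, such as the Get: and Ign: lines, are dropped, and duplicates are collapsed by sort -u.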
raa2463 commented 2 years ago

Facing similar issue E: The repository 'https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu2004/x86_64 Release' does not have a Release file.

MusHusKat commented 2 years ago

Similar issue:

[ERROR] 31/03/2022, 11:16:29 am: E: The repository 'https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release' no longer has a Release file.

tgaddair commented 2 years ago

For those looking for a workaround, in case you're just using one of these images as a parent and don't need to install any additional CUDA libraries, you can run the following before apt-get update:

RUN rm /etc/apt/sources.list.d/cuda.list && rm /etc/apt/sources.list.d/nvidia-ml.list

Depending on how you're building your image, you may need to run these commands with sudo.

This was a suggestion from the Nvidia forums, where this issue is also being tracked:

https://forums.developer.nvidia.com/t/error-apt-get-updating-from-nvidia-cuda11-2-1-base-ubuntu20-04/209836/8
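A slightly more forgiving variant of the command above, as a sketch: `rm -f` does not fail when one of the two list files is absent, which may matter for images that ship only cuda.list. The function name `disable_nvidia_repos` and its optional directory argument are hypothetical, introduced here only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of any NVIDIA tooling): remove the NVIDIA apt
# source lists so that apt-get update no longer contacts the NVIDIA repos.
# The optional first argument overrides the directory; the default is the
# path used in the workaround above.
disable_nvidia_repos() {
    dir="${1:-/etc/apt/sources.list.d}"
    # -f: do not fail if a file is already missing
    rm -f "$dir/cuda.list" "$dir/nvidia-ml.list"
}
```

In a Dockerfile the equivalent single step would be RUN rm -f /etc/apt/sources.list.d/cuda.list /etc/apt/sources.list.d/nvidia-ml.list, placed before the first apt-get update. As with the original workaround, this only suits images that do not need to install additional CUDA packages from those repositories.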

tuong-olli commented 2 years ago

The workaround from tgaddair above (removing cuda.list and nvidia-ml.list before apt-get update) works fine for me. Thank you!

jimsparkman commented 2 years ago

Appears repos are back online again.

santiagoolivar2017 commented 2 years ago

Confirmed, working again for me!

lorriexingfang86 commented 2 years ago

Confirmed here too. It works now!

elezar commented 2 years ago

Please see the link from https://github.com/NVIDIA/nvidia-docker/issues/1009#issuecomment-1083945079

Note that this may be due to a CDN issue with the CUDA download repositories and is not specific to NVIDIA Docker. Closing this issue.

BITWN commented 2 years ago

Confirmed here too: running the following command before apt-get update works for me! RUN rm /etc/apt/sources.list.d/cuda.list