kubeflow / website

Kubeflow Website
https://www.kubeflow.org

[Feedback] (the dataset download link gets 403 error) docs/components/training/user-guides/pytorch.md | #3927

Open itaynvn-runai opened 5 days ago

itaynvn-runai commented 5 days ago

issue:

following this guide: https://www.kubeflow.org/docs/components/training/user-guides/pytorch/

which uses this image:

gcr.io/kubeflow-ci/pytorch-dist-mnist_test:1.0

that attempts to download this file:

http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz

but as of today, requesting this link returns a 403 status.
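a quick way to reproduce the check outside the container is sketched below. it is only a sketch: the `http_status` helper and its `opener` parameter are names made up for illustration, and it assumes the server answers HEAD requests the same way it answers GET.

```python
import urllib.error
import urllib.request


def http_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for `url`.

    `opener` is a hypothetical injection point so the function can be
    exercised without network access; by default it is urllib's urlopen.
    """
    # HEAD avoids downloading the archive; some servers may treat it
    # differently from GET, which is an assumption of this sketch.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is on the exception.
        return err.code


if __name__ == "__main__":
    print(http_status("http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz"))
```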

here you can see the proper output for this image: https://developer-qa.nvidia.com/blog/gpu-containers-runtime/#:~:text=Try%20running%20the%20MNIST%20training%20example%20included%20with%20the%20container%3A

suggestions:

  1. use links from this mirror instead, which is hosted on GitHub and will probably be more reliable:
    https://github.com/fgnt/mnist
  2. allow providing the links to these files via env vars, to avoid hardcoding links that might go dead at some point.
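
suggestion 2 could look roughly like the sketch below. this is not the script used in the image (I couldn't locate it); the `MNIST_BASE_URL` variable name and the `dataset_urls` helper are assumptions made up for illustration, and the four file names are the standard MNIST archive names.

```python
import os

# Standard MNIST archive names; the same names appear under the original host.
MNIST_FILES = (
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
)

# Fallback used when no override is set (the host that currently returns 403).
DEFAULT_BASE_URL = "http://yann.lecun.com/exdb/mnist/"


def dataset_urls(env=None):
    """Build the download URLs, letting MNIST_BASE_URL (hypothetical) override the host."""
    env = os.environ if env is None else env
    base = env.get("MNIST_BASE_URL", DEFAULT_BASE_URL)
    if not base.endswith("/"):
        base += "/"
    return [base + name for name in MNIST_FILES]


if __name__ == "__main__":
    for url in dataset_urls():
        print(url)
```

with something like this, a job spec could set the variable on the container to point at a mirror without rebuilding the image.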

notes: I assume this link is hardcoded in a script that the Dockerfile uses to build this image. I found several references to this link across the Kubeflow GitHub org: https://github.com/search?q=org%3Akubeflow%20%22train-images-idx3-ubyte.gz%22&type=code but I couldn't trace the Dockerfile used to build this image, nor determine which of these scripts it uses.

itaynvn-runai commented 2 days ago

tested with this image: kubeflow/pytorch-dist-mnist:latest (latest tag, pushed on 22/11/2024) https://hub.docker.com/r/kubeflow/pytorch-dist-mnist/tags

the links were switched to a public S3 bucket, and the download process completes:

Using distributed PyTorch with gloo backend
World Size: 2. Rank: 1
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ../data/FashionMNIST/raw/train-images-idx3-ubyte.gz

  0%|          | 0/26421880 [00:00<?, ?it/s]
  0%|          | 65536/26421880 [00:00<01:12, 365219.76it/s]
  1%|          | 229376/26421880 [00:00<00:38, 685094.04it/s]
  3%|▎         | 917504/26421880 [00:00<00:09, 2610886.88it/s]
  7%|▋         | 1933312/26421880 [00:00<00:05, 4111033.66it/s]
 26%|██▌       | 6848512/26421880 [00:00<00:01, 16200010.18it/s]
 38%|███▊      | 10059776/26421880 [00:00<00:00, 20608644.80it/s]
 47%|████▋     | 12517376/26421880 [00:01<00:00, 17876773.56it/s]
 64%|██████▍   | 16973824/26421880 [00:01<00:00, 24547329.01it/s]
 84%|████████▍ | 22315008/26421880 [00:01<00:00, 26412748.88it/s]
 98%|█████████▊| 25985024/26421880 [00:01<00:00, 24075278.44it/s]
100%|██████████| 26421880/26421880 [00:01<00:00, 16889476.36it/s]
Extracting ../data/FashionMNIST/raw/train-images-idx3-ubyte.gz to ../data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ../data/FashionMNIST/raw/train-labels-idx1-ubyte.gz

  0%|          | 0/29515 [00:00<?, ?it/s]
100%|██████████| 29515/29515 [00:00<00:00, 325193.23it/s]
Extracting ../data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to ../data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ../data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz

  0%|          | 0/4422102 [00:00<?, ?it/s]
  1%|▏         | 65536/4422102 [00:00<00:12, 361558.72it/s]
  5%|▌         | 229376/4422102 [00:00<00:06, 681986.84it/s]
 21%|██        | 917504/4422102 [00:00<00:01, 2593771.27it/s]
 44%|████▎     | 1933312/4422102 [00:00<00:00, 4090096.69it/s]
100%|██████████| 4422102/4422102 [00:00<00:00, 6085832.68it/s]
Extracting ../data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to ../data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ../data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz

FYI this new image should replace these 2 old images, which are currently used in a lot of the examples across the repo:

gcr.io/kubeflow-ci/pytorch-dist-mnist_test:1.0 (latest tag, pushed on 07/03/2019) https://console.cloud.google.com/gcr/images/kubeflow-ci/global/pytorch-dist-mnist_test

gcr.io/kubeflow-ci/pytorch-dist-mnist-test:v1.0 (latest tag, pushed on 03/03/2019) https://console.cloud.google.com/gcr/images/kubeflow-ci/global/pytorch-dist-mnist-test