When I run the examples/pytorch/image-classification/create-pytorchjob.ipynb notebook, the "pytorch-dist-mnist-test:v1.0" image tries to download the MNIST training dataset from https://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz, but that URL is currently not working (the request is rejected with HTTP 403).
Error:
Defaulted container "pytorch" out of: pytorch, init-pytorch (init)
Using distributed PyTorch with gloo backend
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Traceback (most recent call last):
File "/var/mnist.py", line 150, in <module>
main()
File "/var/mnist.py", line 123, in main
transforms.Normalize((0.1307,), (0.3081,))
File "/opt/conda/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/datasets/mnist.py", line 46, in __init__
epoch, batch_idx * len(data), len(train_loader.dataset),
File "/opt/conda/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/datasets/mnist.py", line 114, in download
if should_distribute():
File "/opt/conda/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/opt/conda/lib/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/opt/conda/lib/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/opt/conda/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/opt/conda/lib/python3.6/urllib/request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
The same dataset seems to be hosted at https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz, which could be used as a replacement.
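As a possible workaround until the image is updated, here is a minimal sketch of a download helper that tries several mirrors in order. It is not what /var/mnist.py or torchvision currently does; the names `fetch_url` and `download_with_fallback` are hypothetical, and the mirror order is my own assumption. Only the two base URLs come from this report.

```python
import urllib.error
import urllib.request

# Assumption: prefer the S3 mirror, keep the original host as a fallback.
MIRRORS = [
    "https://ossci-datasets.s3.amazonaws.com/mnist/",
    "http://yann.lecun.com/exdb/mnist/",
]


def fetch_url(url, timeout=30):
    """Return the raw bytes served at url (thin wrapper around urllib)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_with_fallback(filename, mirrors=MIRRORS, fetch=fetch_url):
    """Try each mirror in order; return (url, data) from the first that works."""
    errors = []
    for base in mirrors:
        url = base + filename
        try:
            return url, fetch(url)
        except urllib.error.URLError as exc:
            # HTTPError (e.g. the 403 above) is a subclass of URLError.
            errors.append("{}: {}".format(url, exc))
    raise RuntimeError("all mirrors failed:\n" + "\n".join(errors))
```

Calling `download_with_fallback("train-images-idx3-ubyte.gz")` would return the URL that succeeded together with the gzipped bytes. The `fetch` parameter is there only so the helper can be exercised without network access.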
What happened?
When I run the examples/pytorch/image-classification/create-pytorchjob.ipynb notebook, the "pytorch-dist-mnist-test:v1.0" image tries to download the MNIST training dataset from
https://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
but that URL is currently not working; the download fails with the HTTP 403 error shown in the log above.
The same dataset seems to be hosted at
https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
which could be used as a replacement.
ref: https://github.com/pytorch/vision/blob/6d7851bd5e2bedc294e40e90532f0e375fcfee04/torchvision/datasets/mnist.py#L39
What did you expect to happen?
Ideally, the "pytorch-dist-mnist-test:v1.0" image should be updated, or a replacement image should be provided.
Environment
Kubernetes version:
Training Operator version:
Training Operator Python SDK version:
Impacted by this bug?
Give it a 👍 We prioritize the issues with the most 👍