ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0
11.11k stars, 1.19k forks

MNIST Dataset can't be downloaded #4009

Closed mhabedank closed 3 months ago

mhabedank commented 4 months ago

Describe the bug

The MNIST dataset cannot be downloaded from its original source. For example, the URL http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz returns 403 Forbidden, even when opened in a browser.

To Reproduce

Running this code

from ludwig.datasets import mnist
train_df, test_df = mnist.load(split=True)

will result in the error

Finding fallback mirrors to download the dataset. Downloading from the original source failed with the following error HTTP Error 403: Forbidden.
No fallback mirror found. Failed to download dataset MNIST.

Expected behavior

The file should be downloaded.

Additional context

This is not really a bug in Ludwig: the original source is withholding the content. Maybe Ludwig should support alternative download sources, or it could simply use the dataset zoo from torch or sklearn.
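
To illustrate the "different sources" idea, here is a minimal sketch of trying several base URLs in order until one succeeds. It is not Ludwig's actual downloader. The names `download_with_fallback`, `fetch_bytes`, and `CANDIDATE_BASE_URLS` are hypothetical, and the second URL is only an assumed mirror location that would need to be verified before use.

```python
import urllib.error
import urllib.request
from typing import Callable, Sequence

MNIST_FILE = "train-images-idx3-ubyte.gz"

# Base URLs to try, in order. The first is the original source from this
# issue; the second is an ASSUMED mirror and must be verified before use.
CANDIDATE_BASE_URLS = [
    "http://yann.lecun.com/exdb/mnist/",
    "https://ossci-datasets.s3.amazonaws.com/mnist/",
]


def fetch_bytes(url: str, timeout: float = 30.0) -> bytes:
    """Download a URL and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_with_fallback(
    filename: str,
    base_urls: Sequence[str],
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> bytes:
    """Try each base URL in order and return the first successful download."""
    errors = []
    for base in base_urls:
        url = base + filename
        try:
            return fetch(url)
        except (urllib.error.URLError, OSError) as exc:
            # HTTP errors such as 403 Forbidden land here; remember the
            # failure and move on to the next candidate source.
            errors.append(f"{url}: {exc}")
    raise RuntimeError(
        f"All sources failed for {filename}:\n" + "\n".join(errors)
    )
```

Calling `download_with_fallback(MNIST_FILE, CANDIDATE_BASE_URLS)` would attempt the original source first and fall through to the next entry on a 403. The `fetch` parameter is injectable so the fallback logic can be exercised without network access.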