keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0
61.97k stars 19.46k forks

tf.keras.datasets.cifar10.load_data - FileNotFoundError: [Errno 2] No such file or directory #20180

Closed: ashep29 closed this issue 2 months ago

ashep29 commented 2 months ago

Issue with tar.gz file extraction, with inconsistent behaviour across systems/platforms:

(train_images, train_labels), (test_images, test_labels) = keras.datasets.cifar10.load_data()

NAME="Rocky Linux" VERSION="8.10 (Green Obsidian)" ID="rocky" ID_LIKE="rhel centos fedora" VERSION_ID="8.10" PLATFORM_ID="platform:el8"

Python 3.11.0rc1

absl-py==2.1.0 anyio==4.4.0 argon2-cffi==23.1.0 argon2-cffi-bindings==21.2.0 array_record==0.5.1 arrow==1.3.0 asttokens==2.4.1 astunparse==1.6.3 async-lru==2.0.4 attrs==24.2.0 babel==2.16.0 beautifulsoup4==4.12.3 bleach==6.1.0 blinker==1.4 certifi==2024.7.4 cffi==1.17.0 charset-normalizer==3.3.2 click==8.1.7 comm==0.2.2 contourpy==1.3.0 cryptography==3.4.8 cycler==0.12.1 dbus-python==1.2.18 debugpy==1.8.5 decorator==5.1.1 defusedxml==0.7.1 distro==1.7.0 dm-tree==0.1.8 docstring_parser==0.16 etils==1.9.2 executing==2.0.1 fastjsonschema==2.20.0 flatbuffers==24.3.25 fonttools==4.53.1 fqdn==1.5.1 fsspec==2024.6.1 gast==0.6.0 google-pasta==0.2.0 googleapis-common-protos==1.65.0 grpcio==1.64.1 h11==0.14.0 h5py==3.11.0 httpcore==1.0.5 httplib2==0.20.2 httpx==0.27.2 idna==3.7 immutabledict==4.2.0 importlib-metadata==4.6.4 importlib_resources==6.4.4 ipykernel==6.29.5 ipython==8.26.0 ipywidgets==8.1.5 isoduration==20.11.0 jedi==0.19.1 jeepney==0.7.1 Jinja2==3.1.4 joblib==1.4.2 json5==0.9.25 jsonpointer==3.0.0 jsonschema==4.23.0 jsonschema-specifications==2023.12.1 jupyter==1.0.0 jupyter-console==6.6.3 jupyter-events==0.10.0 jupyter-lsp==2.2.5 jupyter_client==8.6.2 jupyter_core==5.7.2 jupyter_server==2.14.2 jupyter_server_terminals==0.5.3 jupyterlab==4.2.5 jupyterlab_pygments==0.3.0 jupyterlab_server==2.27.3 jupyterlab_widgets==3.0.13 kagglehub==0.2.9 keras==3.4.1 keras-core==0.1.7 keras-cv==0.9.0 keras-nlp==0.14.4 keyring==23.5.0 kiwisolver==1.4.5 launchpadlib==1.10.16 lazr.restfulclient==0.14.4 lazr.uri==1.0.6 libclang==18.1.1 Markdown==3.6 markdown-it-py==3.0.0 MarkupSafe==2.1.5 matplotlib==3.9.2 matplotlib-inline==0.1.7 mdurl==0.1.2 mistune==3.0.2 ml-dtypes==0.4.0 more-itertools==8.10.0 mplfinance==0.12.10b0 namex==0.0.8 nbclient==0.10.0 nbconvert==7.16.4 nbformat==5.10.4 nest-asyncio==1.6.0 notebook==7.2.2 notebook_shim==0.2.4 numpy==1.26.4 oauthlib==3.2.0 opt-einsum==3.3.0 optree==0.12.1 overrides==7.7.0 packaging==24.1 pandas==2.2.2 pandocfilters==1.5.1 parso==0.8.4 
pexpect==4.9.0 pillow==10.4.0 platformdirs==4.2.2 prometheus_client==0.20.0 promise==2.3 prompt_toolkit==3.0.47 protobuf==4.25.3 psutil==6.0.0 ptyprocess==0.7.0 pure_eval==0.2.3 pyarrow==17.0.0 pycparser==2.22 pydot==3.0.1 Pygments==2.18.0 PyGObject==3.42.1 PyJWT==2.3.0 pyparsing==3.1.4 python-apt==2.4.0+ubuntu3 python-dateutil==2.9.0.post0 python-json-logger==2.0.7 pytz==2024.1 PyYAML==6.0.2 pyzmq==26.2.0 qtconsole==5.6.0 QtPy==2.4.1 referencing==0.35.1 regex==2024.7.24 requests==2.32.3 rfc3339-validator==0.1.4 rfc3986-validator==0.1.1 rich==13.7.1 rpds-py==0.20.0 scikit-learn==1.5.1 scipy==1.14.1 SecretStorage==3.3.1 Send2Trash==1.8.3 simple_parsing==0.1.5 six==1.16.0 sniffio==1.3.1 soupsieve==2.6 stack-data==0.6.3 tensorboard==2.17.0 tensorboard-data-server==0.7.2 tensorflow==2.17.0 tensorflow-datasets==4.9.6 tensorflow-io-gcs-filesystem==0.37.1 tensorflow-metadata==1.15.0 tensorflow-text==2.17.0 termcolor==2.4.0 terminado==0.18.1 threadpoolctl==3.5.0 tinycss2==1.3.0 toml==0.10.2 tornado==6.4.1 tqdm==4.66.5 traitlets==5.14.3 types-python-dateutil==2.9.0.20240821 typing_extensions==4.12.2 tzdata==2024.1 uri-template==1.3.0 urllib3==2.2.2 wadllib==1.3.6 wcwidth==0.2.13 webcolors==24.8.0 webencodings==0.5.1 websocket-client==1.8.0 Werkzeug==3.0.3 widgetsnbextension==4.0.13 wrapt==1.16.0 zipp==1.0.0

error:

FileNotFoundError                         Traceback (most recent call last)
Cell In[25], line 1
----> 1 (train_images, train_labels), (test_images, test_labels) = keras.datasets.cifar10.load_data()

File ~/.conda/envs/env_name/lib/python3.12/site-packages/keras/src/datasets/cifar10.py:84, in load_data()
     79 for i in range(1, 6):
     80     fpath = os.path.join(path, "data_batch_" + str(i))
     81     (
     82         x_train[(i - 1) * 10000 : i * 10000, :, :, :],
     83         y_train[(i - 1) * 10000 : i * 10000],
---> 84     ) = load_batch(fpath)
     86 fpath = os.path.join(path, "test_batch")
     87 x_test, y_test = load_batch(fpath)

File ~/.conda/envs/env_name/lib/python3.12/site-packages/keras/src/datasets/cifar.py:17, in load_batch(fpath, label_key)
      6 def load_batch(fpath, label_key="labels"):
      7     """Internal utility for parsing CIFAR data.
      8
      9     Args:
    (...)
     15     A tuple (data, labels).
     16     """
---> 17 with open(fpath, "rb") as f:
     18     d = cPickle.load(f, encoding="bytes")
     19     # decode utf8

FileNotFoundError: [Errno 2] No such file or directory: '/home/username/.keras/datasets/cifar-10-batches-py/data_batch_1'

Data is being saved to '/home/username/.keras/datasets/cifar-10-batches-py/cifar-10-batches-py/data_batch_1'
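As a stopgap (not a fix; the helper name and layout below are just my illustration of the duplicated path described above), the nested batch files can be moved up one level so the loader finds them:

```python
import os
import shutil


def flatten_nested_dataset(base):
    """Move files out of an accidentally nested directory.

    E.g. base/cifar-10-batches-py/data_batch_1 is produced from
    base/cifar-10-batches-py/cifar-10-batches-py/data_batch_1,
    where the nested folder shares the name of `base` itself.
    """
    nested = os.path.join(base, os.path.basename(base))
    if os.path.isdir(nested):
        for name in os.listdir(nested):
            shutil.move(os.path.join(nested, name), os.path.join(base, name))
        os.rmdir(nested)


# Usage (the path is the default Keras cache location):
# flatten_nested_dataset(
#     os.path.expanduser("~/.keras/datasets/cifar-10-batches-py"))
```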

mehtamansi29 commented 2 months ago

Hi @ashep29 -

Thanks for reporting the issue. I am not able to reproduce this issue on a RHEL CentOS Stream release 9 machine with Python 3.11.9 and Keras 3.5.0. Can you try upgrading the Keras version (pip install --upgrade keras) and rerun the code?

ashep29 commented 2 months ago

Thanks for taking a look.

I'm running a Jupyter notebook (screenshot below) with TensorFlow installed using docker://tensorflow/tensorflow:latest-gpu (which bundles Keras). I've separately installed keras-cv and keras-nlp.

(screenshot)

Even after the Keras upgrade, the error persists (screenshot).

ghsanti commented 2 months ago

Actually, I realised you were right while testing something else: it does reproduce in Colab.

Here is a simple repro (unrelated code, but the error shows up in the first cells):

Tagging @mehtamansi29 since he may realise what's wrong (this uses 3.5.0)

https://colab.research.google.com/gist/ghsanti/f5111b30cd9dbe778d0f127d6b7b7647/print.ipynb

ashep29 commented 2 months ago

Thanks @ghsanti, definitely the same error you have reproduced. Hopefully this can be resolved soon!

ghsanti commented 2 months ago

It happens with cifar100 as well, but not with MNIST.

The datasets' loading code is here: https://github.com/keras-team/keras/tree/master/keras/src/datasets
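For reference, the load_batch helper from the linked cifar.py boils down to unpickling each batch with bytes keys and re-keying them as strings; the function below is my rough sketch of that step, not the exact Keras source:

```python
import os
import pickle
import tempfile

import numpy as np


def load_batch_like(fpath, label_key="labels"):
    # Unpickle with bytes keys (the CIFAR batches were pickled
    # under Python 2), then decode the keys back to str.
    with open(fpath, "rb") as f:
        d = pickle.load(f, encoding="bytes")
    d = {k.decode("utf8"): v for k, v in d.items()}
    # Each row of "data" is a flattened 3x32x32 image.
    data = np.asarray(d["data"]).reshape(-1, 3, 32, 32)
    return data, d[label_key]


# Try it on a toy batch file written in the same pickled layout.
toy_path = os.path.join(tempfile.mkdtemp(), "toy_batch")
toy = {b"data": np.zeros((2, 3072), dtype=np.uint8), b"labels": [0, 1]}
with open(toy_path, "wb") as f:
    pickle.dump(toy, f)

data, labels = load_batch_like(toy_path)
print(data.shape)  # (2, 3, 32, 32)
print(labels)      # [0, 1]
```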

ashep29 commented 2 months ago

Perhaps it has something to do with how the tar.gz is being extracted? There seems to be a difference between MNIST and the CIFAR datasets.

ghsanti commented 2 months ago

The files are downloaded to:

/root/.keras/datasets/cifar-10-batches-py/cifar-10-batches-py

Here is where it looks for:

/root/.keras/datasets/cifar-10-batches-py/data_batch_1

So cifar-10-batches-py is repeated before the batch files. Does commenting out dirname work? (It may be duplicated by some un-tarring artifact, just as you suggested.)
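A minimal sketch of how that doubling can arise (the file names here are illustrative, not Keras's actual code): the CIFAR tarball itself contains a top-level cifar-10-batches-py/ folder, so extracting it into a cache directory of the same name leaves the batches one level deeper than a loader expecting them directly under the cache directory:

```python
import os
import tarfile
import tempfile

work = tempfile.mkdtemp()

# Build a toy archive whose top-level folder is "cifar-10-batches-py",
# mimicking the real CIFAR-10 tarball's layout.
src = os.path.join(work, "cifar-10-batches-py")
os.makedirs(src)
open(os.path.join(src, "data_batch_1"), "w").close()
archive = os.path.join(work, "cifar-10-python.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    tar.add(src, arcname="cifar-10-batches-py")

# Extract into a directory that is *also* named "cifar-10-batches-py",
# as a cache layout might do.
dest = os.path.join(work, "datasets", "cifar-10-batches-py")
os.makedirs(dest)
with tarfile.open(archive) as tar:
    tar.extractall(dest)

# The batch file now sits one level deeper than expected.
print(os.path.exists(os.path.join(dest, "data_batch_1")))
print(os.path.exists(
    os.path.join(dest, "cifar-10-batches-py", "data_batch_1")))
```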

ashep29 commented 2 months ago

Thanks @ghsanti, I've commented out the dirname argument, but the value returned by get_file is still '/home/username/.keras/datasets/cifar-10-batches-py/data_batch_1', so the error persists. Perhaps the error actually lies in keras.src.utils.file_utils (get_file)?

dwgily commented 2 months ago

If you figure it out, please let us know!

ghsanti commented 2 months ago

This works in Colab @ashep29 @himanshurana-27:

    origin = "https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"
    path = get_file(
        fname="cifar-10-batches-py",
        cache_dir="cifar-10-batches-py",
        origin=origin,
        untar=True,
        file_hash=(  # noqa: E501
            "6d958be074577803d12ecdefd02955f39262c83c16fe9348329d7fe0b5c001ce"
        ),
    )

I added cache_dir since fname cannot contain /.

The function recommends replacing untar with extract, but those do not do the same thing, so I left untar.

ashep29 commented 2 months ago

OK, so the only change needed was to add extract=True in the get_file call. It is now working. I didn't need cache_dir and left fname as it was originally, so the final working code is:

    dirname = "cifar-10-batches-py"
    origin = "https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"
    path = get_file(
        fname=dirname,
        origin=origin,
        extract=True,
        untar=True,
        file_hash=(  # noqa: E501
            "6d958be074577803d12ecdefd02955f39262c83c16fe9348329d7fe0b5c001ce"
        ),
    )
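For anyone checking whether the cache ended up in the right shape after a change like this, a small helper (my own sketch, not part of Keras) can list which of the expected batch files are absent:

```python
import os


def missing_cifar_batches(base):
    """Return which of the six expected CIFAR-10 batch files
    (data_batch_1..5 plus test_batch) are absent from `base`."""
    expected = ["data_batch_%d" % i for i in range(1, 6)] + ["test_batch"]
    return [f for f in expected if not os.path.isfile(os.path.join(base, f))]


# Usage (the path is the default Keras cache location):
# missing_cifar_batches(
#     os.path.expanduser("~/.keras/datasets/cifar-10-batches-py"))
```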

Thank you for your help!

ghsanti commented 2 months ago

That's another possibility indeed @ashep29 :-) It may need to be added for cifar100 as well.

> Thank you for your help!

My pleasure

mehtamansi29 commented 2 months ago

Hi @ashep29 , @ghsanti -

This issue reproduces with the Keras nightly (3.5.0.dev2024082903) version. With Keras 3.5.0 it works fine. I've attached the gist for your reference. We will look into the issue and update you on that.

ghsanti commented 2 months ago

Yes, there is a PR just above your comment with more information, in case you want to look at it. @mehtamansi29

google-ml-butler[bot] commented 2 months ago

Are you satisfied with the resolution of your issue?