Lightning-AI / litdata

Streamline data pipelines for AI. Process datasets across 1000s of machines, and optimize data for blazing fast model training.
Apache License 2.0

AttributeError: `np.sctypes` was removed in the NumPy 2.0 release. #175

Open bhimrazy opened 2 weeks ago

bhimrazy commented 2 weeks ago

🐛 AttributeError: np.sctypes was removed in the NumPy 2.0 release.

    from litdata.constants import _IS_IN_STUDIO
  File "/Users/bhimrajyadav/test-litdata/venv/lib/python3.11/site-packages/litdata/constants.py", line 61, in <module>
    _NUMPY_SCTYPES = [v for values in np.sctypes.values() for v in values]
                                      ^^^^^^^^^^
  File "/Users/bhimrajyadav/test-litdata/venv/lib/python3.11/site-packages/numpy/__init__.py", line 397, in __getattr__
    raise AttributeError(
AttributeError: `np.sctypes` was removed in the NumPy 2.0 release. Access dtypes explicitly instead.. Did you mean: 'dtypes'?
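
For context, np.sctypes used to map the scalar-type kinds ("int", "uint", "float", "complex", "others") to lists of NumPy scalar types, and litdata flattened that mapping into a single list. A minimal NumPy-2-compatible sketch that rebuilds a similar flat list explicitly (just an illustration, not necessarily the exact fix shipped in litdata):

import numpy as np

# Illustrative replacement for the removed np.sctypes lookup:
# list the scalar types explicitly instead of reading np.sctypes.values().
_NUMPY_SCTYPES = [
    np.int8, np.int16, np.int32, np.int64,        # "int"
    np.uint8, np.uint16, np.uint32, np.uint64,    # "uint"
    np.float16, np.float32, np.float64,           # "float"
    np.complex64, np.complex128,                  # "complex"
    bool, object, bytes, str, np.void,            # "others"
]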

To Reproduce

Steps to reproduce the behavior: copy the given code from the README into e.g. test.py and run it:

import numpy as np
from litdata import optimize
from PIL import Image

# Store random images into the data chunks
def random_images(index):
    data = {
        "index": index, # int data type
        "image": Image.fromarray(np.random.randint(0, 256, (32, 32, 3), np.uint8)), # PIL image data type
        "class": np.random.randint(10), # numpy array data type
    }
    # The data is serialized into bytes and stored into data chunks by the optimize operator.
    return data # The data is serialized into bytes and stored into data chunks by the optimize operator.

if __name__ == "__main__":
    optimize(
        fn=random_images,  # The function applied over each input.
        inputs=list(range(1000)),  # Provide any inputs. The fn is applied on each item.
        output_dir="my_optimized_dataset",  # The directory where the optimized data are stored.
        num_workers=4,  # The number of workers. The inputs are distributed among them.
        chunk_bytes="64MB"  # The maximum number of bytes to write into a data chunk.
    )

Environment

plra commented 1 week ago

Also experiencing this today. pytorch==2.3.0 via pip, Ubuntu 22.04, Python 3.11.9.

bhimrazy commented 1 week ago

Also experiencing this today. pytorch==2.3.0 via pip, Ubuntu 22.04, Python 3.11.9.

@plra, if the issue persists, you can fix it manually by installing a numpy version < 2.0.0.
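
For example, something along the lines of:

pip install "numpy<2.0.0"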

plra commented 1 week ago

Also experiencing this today. pytorch==2.3.0 via pip, Ubuntu 22.04, Python 3.11.9.

@plra, if the issue persists, you can fix it manually by installing a numpy version < 2.0.0.

Thanks, yeah, adding an explicit numpy<2.0.0 dependency fixes this for now.
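
For reference, that can be a requirements.txt line along these lines:

numpy<2.0.0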

felipemfpeixoto commented 1 week ago

Hey man, were you able to fix it? I'm experiencing it with coremltools while trying to convert my PyTorch model to Core ML. I already installed numpy version 1.26.4, but the issue persists.

bhimrazy commented 1 week ago

Hey man, were you able to fix it? I'm experiencing it with coremltools while trying to convert my PyTorch model to Core ML. I already installed numpy version 1.26.4, but the issue persists.

If it still throws the same error, it is coming from NumPy 2.0. Please try uninstalling numpy first, then reinstall it and verify the version. FYI: @felipemfpeixoto
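
For example, something like:

pip uninstall numpy
pip install "numpy<2.0.0"
python -c "import numpy; print(numpy.__version__)"

The last command just confirms which numpy version the environment actually imports.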

loodvn commented 1 week ago

For what it's worth, it seems the numpy version was pinned in litdata 0.2.12, after this PR: https://github.com/Lightning-AI/litdata/pull/161/files#diff-4d7c51b1efe9043e44439a949dfd92e5827321b34082903477fd04876edb7552

So you can pick up the fix by including litdata>=0.2.12 or litdata~=0.2.12 in downstream requirements.txt files.
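
For example, a requirements.txt entry along these lines:

litdata>=0.2.12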

peacefulotter commented 6 days ago

For what it's worth, it seems the numpy version was pinned in litdata 0.2.12, after this PR: https://github.com/Lightning-AI/litdata/pull/161/files#diff-4d7c51b1efe9043e44439a949dfd92e5827321b34082903477fd04876edb7552

So you can pick up the fix by including litdata>=0.2.12 or litdata~=0.2.12 in downstream requirements.txt files.

Not sure what you mean by "downstream requirements.txt". I tried pip install "litdata>=0.2.12" and am getting the same error :/

deependujha commented 6 days ago

For what it's worth, it seems the numpy version was pinned in litdata 0.2.12, after this PR: https://github.com/Lightning-AI/litdata/pull/161/files#diff-4d7c51b1efe9043e44439a949dfd92e5827321b34082903477fd04876edb7552 So you can pick up the fix by including litdata>=0.2.12 or litdata~=0.2.12 in downstream requirements.txt files.

Not sure what you mean by "downstream requirements.txt". I tried pip install "litdata>=0.2.12" and am getting the same error :/

Try uninstalling the previous version and then reinstalling:

pip uninstall litdata
pip install litdata
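
If in doubt, you can confirm what actually got installed with something like:

pip show litdata numpy

and check that the reported numpy version is below 2.0.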