microsoft / FLAML

A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
https://microsoft.github.io/FLAML/
MIT License

Incompatibility with NumPy 2.0.0: Module Crashes Due to Binary Incompatibility and Attribute Errors #1315

Open Programmer-RD-AI opened 1 week ago

Description:

Importing flaml crashes when NumPy 2.0.0 is installed. The errors point to a binary incompatibility with NumPy 2.0.0: importing pyarrow.lib raises AttributeError: _ARRAY_API not found, and importing pandas._libs.interval raises ValueError: numpy.dtype size changed. Both failures occur while flaml imports pandas (flaml/automl/spark/__init__.py, line 31).

Steps to Reproduce:

  1. Install NumPy 2.0.0 in the environment.
  2. Try to run a script that imports and uses the flaml module.
  3. Observe the following traceback:
Traceback (most recent call last):
  File "/home/ranuga/testing/fdsfds.py", line 1, in <module>
    from flaml import AutoML
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/__init__.py", line 3, in <module>
    from flaml.automl import AutoML, logger_formatter
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/automl/__init__.py", line 1, in <module>
    from flaml.automl.automl import AutoML, size
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/automl/automl.py", line 19, in <module>
    from flaml.automl.ml import train_estimator
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/automl/ml.py", line 11, in <module>
    from flaml.automl.data import group_counts
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/automl/data.py", line 11, in <module>
    from flaml.automl.spark import DataFrame, Series, pd, ps, psDataFrame, psSeries
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/automl/spark/__init__.py", line 31, in <module>
    import pandas as pd
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/pandas/__init__.py", line 23, in <module>
    from pandas.compat import (
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/pandas/compat/__init__.py", line 27, in <module>
    from pandas.compat.pyarrow import (
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/pandas/compat/pyarrow.py", line 8, in <module>
    import pyarrow as pa
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/pyarrow/__init__.py", line 65, in <module>
    import pyarrow.lib as _lib
AttributeError: _ARRAY_API not found
Traceback (most recent call last):
  File "/home/ranuga/testing/fdsfds.py", line 1, in <module>
    from flaml import AutoML
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/__init__.py", line 3, in <module>
    from flaml.automl import AutoML, logger_formatter
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/automl/__init__.py", line 1, in <module>
    from flaml.automl.automl import AutoML, size
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/automl/automl.py", line 19, in <module>
    from flaml.automl.ml import train_estimator
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/automl/ml.py", line 11, in <module>
    from flaml.automl.data import group_counts
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/automl/data.py", line 11, in <module>
    from flaml.automl.spark import DataFrame, Series, pd, ps, psDataFrame, psSeries
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/flaml/automl/spark/__init__.py", line 31, in <module>
    import pandas as pd
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/pandas/__init__.py", line 46, in <module>
    from pandas.core.api import (
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/pandas/core/api.py", line 1, in <module>
    from pandas._libs import (
  File "/home/ranuga/anaconda3/lib/python3.11/site-packages/pandas/_libs/__init__.py", line 18, in <module>
    from pandas._libs.interval import Interval
  File "interval.pyx", line 1, in init pandas._libs.interval
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
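
To narrow down which package in the import chain fails first, each module named in the traceback can be imported on its own. The following is a minimal sketch, not part of flaml; the helper name try_import is hypothetical, and it catches Exception broadly because the errors above are AttributeError and ValueError rather than ImportError.

```python
import importlib


def try_import(module_name):
    """Import a module by name and report whether it succeeded.

    Returns (True, None) on success, or (False, "<ErrorType>: <message>")
    on failure. try_import is a hypothetical helper for illustration.
    """
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # binary-incompatibility errors are not always ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, None


if __name__ == "__main__":
    # Modules named in the traceback, from the lowest-level dependency upward.
    for name in ("numpy", "pyarrow", "pandas", "flaml"):
        ok, error = try_import(name)
        print(f"{name}: {'ok' if ok else error}")
```

If pyarrow or pandas already fails here, the problem can be reproduced without flaml in the picture.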

Expected Behavior:

The flaml module should import and run under NumPy 2.0.0 without raising an AttributeError or ValueError caused by binary incompatibility.

Workaround:

Currently, the easiest workaround is to downgrade to NumPy 1.x (numpy<2), or to upgrade the affected packages (pandas and pyarrow in the traceback above) if versions compatible with NumPy 2 are available.
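
Before applying the workaround, it may help to confirm which versions are actually installed. This is a rough sketch using only the standard library's importlib.metadata; the function names are hypothetical, and the package list is an assumption taken from the traceback above. It reads package metadata only, so it runs even when importing the packages themselves fails.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '2.0.0'."""
    return int(version_string.split(".")[0])


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def numpy_2_active(versions):
    """True if the 'numpy' entry in the mapping is version 2 or later."""
    numpy_version = versions.get("numpy")
    return numpy_version is not None and major_version(numpy_version) >= 2


if __name__ == "__main__":
    # Package list assumed from the traceback; adjust for your environment.
    versions = installed_versions(["numpy", "pandas", "pyarrow", "flaml"])
    for name, found in versions.items():
        print(f"{name}: {found or 'not installed'}")
    if numpy_2_active(versions):
        print("NumPy 2.x detected: pin numpy<2 or upgrade pandas and pyarrow.")
```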

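Environment:

Anaconda with Python 3.11 (inferred from the site-packages paths in the traceback) and NumPy 2.0.0.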

Additional Information:

Please investigate this issue and provide a fix or guidance on how to make the flaml module compatible with NumPy 2.0.0. Thank you!
