NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

Fix Python runtime error caused by numpy 2.0.0 release #1128

Closed amahussein closed 2 weeks ago

amahussein commented 2 weeks ago

Signed-off-by: Ahmed Hussein (amahussein) a@ahussein.me

Fixes #1127

This code change fixes a runtime error caused by pandas loading numpy 2.1+, which conflicts with pyArrow, which requires numpy 1.x. In this commit, the fix is to:

Note that:

amahussein commented 2 weeks ago

shap does not support numpy 2.0 yet. The runtime prints the stack trace below, but it continues to run.

In that case, the fix is to import shap dynamically, only when it can be supported. Otherwise, we cannot support all Python versions 3.8-3.12.

Traceback (most recent call last):
  File "~/.venv/bin/spark_rapids", line 5, in <module>
    from spark_rapids_tools.cmdli.tools_cli import main
  File "~/user_tools/src/spark_rapids_tools/cmdli/__init__.py", line 17, in <module>
    from .tools_cli import ToolsCLI
  File "~/user_tools/src/spark_rapids_tools/cmdli/tools_cli.py", line 20, in <module>
    from spark_rapids_tools.cmdli.argprocessor import AbsToolUserArgModel
  File "~/user_tools/src/spark_rapids_tools/cmdli/argprocessor.py", line 32, in <module>
    from spark_rapids_pytools.rapids.qualification import QualGpuClusterReshapeType
  File "~/user_tools/src/spark_rapids_pytools/rapids/qualification.py", line 33, in <module>
    from spark_rapids_tools.tools.qualx.qualx_main import predict
  File "~/user_tools/src/spark_rapids_tools/tools/qualx/qualx_main.py", line 32, in <module>
    from spark_rapids_tools.tools.qualx.model import (
  File "~/user_tools/src/spark_rapids_tools/tools/qualx/model.py", line 19, in <module>
    import shap
  File "~/.venv/lib/python3.10/site-packages/shap/__init__.py", line 4, in <module>
    from .explainers import other
  File "~/.venv/lib/python3.10/site-packages/shap/explainers/__init__.py", line 4, in <module>
    from ._gpu_tree import GPUTreeExplainer
  File "~/.venv/lib/python3.10/site-packages/shap/explainers/_gpu_tree.py", line 5, in <module>
    from ._tree import (
  File "~/.venv/lib/python3.10/site-packages/shap/explainers/_tree.py", line 29, in <module>
    from .. import _cext
AttributeError: _ARRAY_API not found
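
The dynamic import mentioned above could look roughly like the sketch below. It is not the code from this PR or from #1130. The helper names `optional_import` and `load_shap` are hypothetical, and the sketch assumes the failure surfaces as an `ImportError` or `AttributeError` raised at import time. If the traceback is only printed as a warning while the import succeeds, this guard will not suppress it.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(module_name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it cannot be loaded.

    Hypothetical helper, not part of spark_rapids_tools.
    """
    try:
        return importlib.import_module(module_name)
    except (ImportError, AttributeError) as exc:
        # ImportError covers a missing or broken package. AttributeError is
        # assumed here to cover failures such as "_ARRAY_API not found"
        # raised while a C extension is being loaded.
        print(f"Optional dependency '{module_name}' is unavailable: {exc}")
        return None


def load_shap() -> Optional[ModuleType]:
    """Resolve shap lazily so that importing this module never fails."""
    return optional_import("shap")
```

A caller would check the result of `load_shap()` for `None` and skip the shap-based code path in that case, instead of running `import shap` at the top of `model.py`.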

amahussein commented 2 weeks ago

Closing this PR in favour of #1130.