Open ayyappagundu opened 1 month ago
Hi @ayyappagundu, we seem to have some major problems with our dependencies. I am trying to get a handle on them, and I hope we can fix this in the near future. Thanks for the ticket.
Hi! Any update on this?
Hi @ayyappagundu, we are moving from requirements.txt to pyproject.toml with poetry and hatch. We also need to pin a lot of dependencies. That has actually cost us some time, but we are well on the way.
In addition, we will probably have to overhaul our distribution backend, as we are using outdated versions of ray. Is it important that you distribute your task, or can you also run it on a compute node?
Description
I'm encountering an error when installing ludwig[distributed] in a Jupyter Notebook environment running on a Dataproc cluster. The installation seems to proceed normally until it attempts to install scikit-learn, at which point the process fails.
Steps to Reproduce
1. Launch a Dataproc cluster with a Jupyter Notebook environment.
2. Open a new Jupyter Notebook within the cluster.
3. Execute the command: !pip install ludwig[distributed]
Error
Collecting scikit-learn (from ludwig[distributed])
Using cached https://nexus.onedev.neustar.biz/repository/ds-pypi-group/packages/scikit-learn/1.5.2/scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.3 MB)
Using cached https://nexus.onedev.neustar.biz/repository/ds-pypi-group/packages/scikit-learn/1.5.1/scikit_learn-1.5.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.3 MB)
Using cached https://nexus.onedev.neustar.biz/repository/ds-pypi-group/packages/scikit-learn/1.2.0/scikit_learn-1.2.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.5 MB)
Using cached https://nexus.onedev.neustar.biz/repository/ds-pypi-group/packages/scikit-learn/1.1.3/scikit_learn-1.1.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.0 MB)
Using cached https://nexus.onedev.xxxx.biz/repository/ds-pypi-group/packages/scikit-learn/1.1.2/scikit-learn-1.1.2.tar.gz (7.0 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... error
error: subprocess-exited-with-error
× Preparing metadata (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [2269 lines of output]
Partial import of sklearn during the build process.
setup.py:128: DeprecationWarning:
Declare '_subtract_histograms' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
Use an 'int' return type on '_subtract_histograms' to allow an error code to be returned.
Error compiling Cython file:
...
if n_used_bins <= 1:
    free(cat_infos)
    return
sklearn/ensemble/_hist_gradient_boosting/splitting.pyx:920:14: Cannot assign type 'int (const void *, const void *) except? -1 nogil' to 'int (*)(const void *, const void *) noexcept nogil'. Exception values are incompatible. Suggest adding 'noexcept' to the type of 'compare_cat_infos'.
Traceback (most recent call last):
File "/tmp/pip-build-env-357_itq6/overlay/lib/python3.11/site-packages/Cython/Build/Dependencies.py", line 1345, in cythonize_one_helper
File "/opt/conda/miniconda3/lib/python3.11/multiprocessing/pool.py", line 774, in get
raise self._value
Cython.Compiler.Errors.CompileError: sklearn/ensemble/_hist_gradient_boosting/splitting.pyx
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
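Reading the log above: pip fetched several scikit-learn releases in turn and only failed once it reached 1.1.2, the single candidate that came as a source archive (.tar.gz) rather than a prebuilt cp311 wheel. The short sketch below makes that visible by classifying the filenames copied from the "Using cached" lines. The helper names `classify` and `first_source_build` are hypothetical, and this is only a way to read the log, not a diagnosis confirmed by the maintainers.

```python
import re

# Filenames copied from the "Using cached" lines of the log, in the order pip tried them.
CANDIDATES = [
    "scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "scikit_learn-1.5.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "scikit_learn-1.2.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "scikit_learn-1.1.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "scikit-learn-1.1.2.tar.gz",
]


def classify(filename):
    """Return (version, kind) where kind is 'wheel' or 'sdist' (hypothetical helper)."""
    match = re.match(r"scikit[-_]learn-(\d+(?:\.\d+)*)", filename)
    if match is None:
        raise ValueError(f"not a scikit-learn distribution file: {filename}")
    kind = "wheel" if filename.endswith(".whl") else "sdist"
    return match.group(1), kind


def first_source_build(filenames):
    """Return the version of the first candidate that must be compiled from source."""
    for name in filenames:
        version, kind = classify(name)
        if kind == "sdist":
            return version
    return None


for name in CANDIDATES:
    print(classify(name))
print("first source build:", first_source_build(CANDIDATES))
```

The source build is where the Cython compile step runs, which matches the point at which the log reports the `noexcept` error in splitting.pyx.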
Environment
Dataproc Cluster: image-version 2.2-debian12
Jupyter Notebook (testing the package in a Jupyter notebook on the Dataproc cluster)
Python Version: 3.11.8
Commands tried: !pip install ludwig[distributed] and also !pip install ludwig
Package Repository: nexus.onedev.xxxx.biz (private repository configured for installing packages; xxxx masks my organization name for security)
Additional context
I also tried other environments, Python 3.10.8 and Python 3.8.15, by downgrading the image version of the Dataproc cluster (GCP).
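A possible stopgap while the dependency pins are being reworked, sketched here under assumptions and not tested on Dataproc: tell pip never to build scikit-learn from source, using pip's `--only-binary` option. With that option pip should either select a prebuilt wheel or stop with a resolution error instead of attempting the Cython compile. It does not resolve the underlying version conflict. The function names `build_pip_command` and `install` are hypothetical.

```python
import subprocess
import sys


def build_pip_command(requirement, binary_only=("scikit-learn",)):
    """Build a pip command that forbids source builds for the listed packages."""
    cmd = [sys.executable, "-m", "pip", "install"]
    for package in binary_only:
        # --only-binary may be given several times; each use adds one package.
        cmd += ["--only-binary", package]
    cmd.append(requirement)
    return cmd


def install(requirement):
    """Run the install and return pip's exit code (0 means success)."""
    return subprocess.run(build_pip_command(requirement), check=False).returncode


# Show the command without running it.
print(" ".join(build_pip_command("ludwig[distributed]")))
```

In a notebook cell the equivalent would be `!pip install --only-binary scikit-learn "ludwig[distributed]"`. If pip then reports that no matching distribution can be found, that would suggest the conflict lies in the version constraints themselves rather than in the build step.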