OscarKjell / text

Using Transformers from HuggingFace in R
https://r-text.org

`textrpp_install` fails to install hdbscan 0.8.33 #190

Open m-pilarski opened 2 months ago

m-pilarski commented 2 months ago

Hey,

I cannot install the Python dependencies by running textrpp_install() on Linux (fully updated Manjaro).

After removing the hdbscan version pin (hdbscan==0.8.33) from rpp_version=, as in the call below, it works. 🤷 The log after the call is from the failing default installation.

text::textrpp_install(
  rpp_version=c(
    "torch==2.2.0", "transformers==4.38.0", 
    "huggingface_hub==0.20.0", "numpy==1.26.0", 
    "pandas==2.0.3", "nltk==3.8.1", "scikit-learn==1.3.0", 
    "datasets==2.16.1", "evaluate==0.4.0", "accelerate==0.26.0", 
    "bertopic==0.16.3", "jsonschema==4.19.2", "sentence-transformers==2.2.2", 
    "flair==0.13.0", "umap-learn==0.5.6", "hdbscan", 
    "scipy==1.10.1"
  )
)

Building wheel for hdbscan (pyproject.toml): finished with status 'error'
  error: subprocess-exited-with-error

  × Building wheel for hdbscan (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [140 lines of output]
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py:268: UserWarning: Unknown distribution option: 'test_suite'
        warnings.warn(msg)
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py:268: UserWarning: Unknown distribution option: 'tests_require'
        warnings.warn(msg)
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-cpython-39
      creating build/lib.linux-x86_64-cpython-39/hdbscan
      copying hdbscan/__init__.py -> build/lib.linux-x86_64-cpython-39/hdbscan
      copying hdbscan/flat.py -> build/lib.linux-x86_64-cpython-39/hdbscan
      copying hdbscan/hdbscan_.py -> build/lib.linux-x86_64-cpython-39/hdbscan
      copying hdbscan/plots.py -> build/lib.linux-x86_64-cpython-39/hdbscan
      copying hdbscan/prediction.py -> build/lib.linux-x86_64-cpython-39/hdbscan
      copying hdbscan/robust_single_linkage_.py -> build/lib.linux-x86_64-cpython-39/hdbscan
      copying hdbscan/validity.py -> build/lib.linux-x86_64-cpython-39/hdbscan
      creating build/lib.linux-x86_64-cpython-39/hdbscan/tests
      copying hdbscan/tests/__init__.py -> build/lib.linux-x86_64-cpython-39/hdbscan/tests
      copying hdbscan/tests/test_flat.py -> build/lib.linux-x86_64-cpython-39/hdbscan/tests
      copying hdbscan/tests/test_hdbscan.py -> build/lib.linux-x86_64-cpython-39/hdbscan/tests
      copying hdbscan/tests/test_prediction_utils.py -> build/lib.linux-x86_64-cpython-39/hdbscan/tests
      copying hdbscan/tests/test_rsl.py -> build/lib.linux-x86_64-cpython-39/hdbscan/tests
      running build_ext
      cythoning hdbscan/_hdbscan_tree.pyx to hdbscan/_hdbscan_tree.c
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-ctq8jvpp/hdbscan_ae385d94641e47f89c8d4ed7f8e51a3e/hdbscan/_hdbscan_tree.pyx
        tree = Parsing.p_module(s, pxd, full_module_name)
      cythoning hdbscan/_hdbscan_linkage.pyx to hdbscan/_hdbscan_linkage.c
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-ctq8jvpp/hdbscan_ae385d94641e47f89c8d4ed7f8e51a3e/hdbscan/_hdbscan_linkage.pyx
        tree = Parsing.p_module(s, pxd, full_module_name)
      cythoning hdbscan/_hdbscan_boruvka.pyx to hdbscan/_hdbscan_boruvka.c
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-ctq8jvpp/hdbscan_ae385d94641e47f89c8d4ed7f8e51a3e/hdbscan/_hdbscan_boruvka.pyx
        tree = Parsing.p_module(s, pxd, full_module_name)
      cythoning hdbscan/_hdbscan_reachability.pyx to hdbscan/_hdbscan_reachability.c
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-ctq8jvpp/hdbscan_ae385d94641e47f89c8d4ed7f8e51a3e/hdbscan/_hdbscan_reachability.pyx
        tree = Parsing.p_module(s, pxd, full_module_name)
      cythoning hdbscan/_prediction_utils.pyx to hdbscan/_prediction_utils.c
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-ctq8jvpp/hdbscan_ae385d94641e47f89c8d4ed7f8e51a3e/hdbscan/_prediction_utils.pyx
        tree = Parsing.p_module(s, pxd, full_module_name)
      cythoning hdbscan/dist_metrics.pyx to hdbscan/dist_metrics.c
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-ctq8jvpp/hdbscan_ae385d94641e47f89c8d4ed7f8e51a3e/hdbscan/dist_metrics.pxd
        tree = Parsing.p_module(s, pxd, full_module_name)
      building 'hdbscan._hdbscan_tree' extension
      creating build/temp.linux-x86_64-cpython-39
      creating build/temp.linux-x86_64-cpython-39/hdbscan
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -I/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include/python3.9 -I/tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include -c hdbscan/_hdbscan_tree.c -o build/temp.linux-x86_64-cpython-39/hdbscan/_hdbscan_tree.o
      In file included from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                       from hdbscan/_hdbscan_tree.c:752:
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
         17 | #warning "Using deprecated NumPy API, disable it with " \
            |  ^~~~~~~
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -shared -Wl,-rpath,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath-link,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -L/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath-link,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -L/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib build/temp.linux-x86_64-cpython-39/hdbscan/_hdbscan_tree.o -o build/lib.linux-x86_64-cpython-39/hdbscan/_hdbscan_tree.cpython-39-x86_64-linux-gnu.so
      building 'hdbscan._hdbscan_linkage' extension
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -I/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include/python3.9 -I/tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include -c hdbscan/_hdbscan_linkage.c -o build/temp.linux-x86_64-cpython-39/hdbscan/_hdbscan_linkage.o
      In file included from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                       from hdbscan/_hdbscan_linkage.c:752:
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
         17 | #warning "Using deprecated NumPy API, disable it with " \
            |  ^~~~~~~
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -shared -Wl,-rpath,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath-link,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -L/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath-link,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -L/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib build/temp.linux-x86_64-cpython-39/hdbscan/_hdbscan_linkage.o -o build/lib.linux-x86_64-cpython-39/hdbscan/_hdbscan_linkage.cpython-39-x86_64-linux-gnu.so
      building 'hdbscan._hdbscan_boruvka' extension
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -I/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include/python3.9 -I/tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include -c hdbscan/_hdbscan_boruvka.c -o build/temp.linux-x86_64-cpython-39/hdbscan/_hdbscan_boruvka.o
      In file included from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                       from hdbscan/_hdbscan_boruvka.c:752:
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
         17 | #warning "Using deprecated NumPy API, disable it with " \
            |  ^~~~~~~
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -shared -Wl,-rpath,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath-link,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -L/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath-link,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -L/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib build/temp.linux-x86_64-cpython-39/hdbscan/_hdbscan_boruvka.o -o build/lib.linux-x86_64-cpython-39/hdbscan/_hdbscan_boruvka.cpython-39-x86_64-linux-gnu.so
      building 'hdbscan._hdbscan_reachability' extension
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -I/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include/python3.9 -I/tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include -c hdbscan/_hdbscan_reachability.c -o build/temp.linux-x86_64-cpython-39/hdbscan/_hdbscan_reachability.o
      In file included from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                       from hdbscan/_hdbscan_reachability.c:752:
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
         17 | #warning "Using deprecated NumPy API, disable it with " \
            |  ^~~~~~~
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -shared -Wl,-rpath,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath-link,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -L/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath-link,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -L/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib build/temp.linux-x86_64-cpython-39/hdbscan/_hdbscan_reachability.o -o build/lib.linux-x86_64-cpython-39/hdbscan/_hdbscan_reachability.cpython-39-x86_64-linux-gnu.so
      building 'hdbscan._prediction_utils' extension
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -I/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include/python3.9 -I/tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include -c hdbscan/_prediction_utils.c -o build/temp.linux-x86_64-cpython-39/hdbscan/_prediction_utils.o
      In file included from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                       from hdbscan/_prediction_utils.c:752:
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
         17 | #warning "Using deprecated NumPy API, disable it with " \
            |  ^~~~~~~
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -shared -Wl,-rpath,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath-link,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -L/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -Wl,-rpath-link,/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib -L/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/lib build/temp.linux-x86_64-cpython-39/hdbscan/_prediction_utils.o -o build/lib.linux-x86_64-cpython-39/hdbscan/_prediction_utils.cpython-39-x86_64-linux-gnu.so
      building 'hdbscan.dist_metrics' extension
      gcc -pthread -B /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/compiler_compat -Wl,--sysroot=/ -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -O2 -isystem /home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include -fPIC -I/home/moritz/.local/share/r-miniconda/envs/textrpp_condaenv/include/python3.9 -I/tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include -c hdbscan/dist_metrics.c -o build/temp.linux-x86_64-cpython-39/hdbscan/dist_metrics.o
      In file included from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                       from /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                       from hdbscan/dist_metrics.c:752:
      /tmp/pip-build-env-gmr9e9gm/overlay/lib/python3.9/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
         17 | #warning "Using deprecated NumPy API, disable it with " \
            |  ^~~~~~~
      hdbscan/dist_metrics.c: In function ‘__pyx_f_7hdbscan_12dist_metrics_18SEuclideanDistance_dist’:
      hdbscan/dist_metrics.c:7338:75: error: passing argument 1 of ‘__pyx_f_7hdbscan_12dist_metrics_18SEuclideanDistance_rdist’ from incompatible pointer type [-Wincompatible-pointer-types]
       7338 |   __pyx_t_1 = __pyx_f_7hdbscan_12dist_metrics_18SEuclideanDistance_rdist(((struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *)__pyx_v_self), __pyx_v_x1, __pyx_v_x2, __pyx_v_size); if (unlikely(__pyx_t_1 == ((__pyx_t_7hdbscan_12dist_metrics_DTYPE_t)-1.0))) __PYX_ERR(1, 474, __pyx_L1_error)
            |                                                                          ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            |                                                                           |
            |                                                                           struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *
      hdbscan/dist_metrics.c:7135:168: note: expected ‘struct __pyx_obj_7hdbscan_12dist_metrics_SEuclideanDistance *’ but argument is of type ‘struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *’
       7135 | static __pyx_t_7hdbscan_12dist_metrics_DTYPE_t __pyx_f_7hdbscan_12dist_metrics_18SEuclideanDistance_rdist(struct __pyx_obj_7hdbscan_12dist_metrics_SEuclideanDistance *__pyx_v_self, __pyx_t_7hdbscan_12dist_metrics_DTYPE_t *__pyx_v_x1, __pyx_t_7hdbscan_12dist_metrics_DTYPE_t *__pyx_v_x2, __pyx_t_7hdbscan_12dist_metrics_ITYPE_t __pyx_v_size) {
            |                                                                                                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
      hdbscan/dist_metrics.c: In function ‘__pyx_f_7hdbscan_12dist_metrics_17MinkowskiDistance_dist’:
      hdbscan/dist_metrics.c:8114:74: error: passing argument 1 of ‘__pyx_f_7hdbscan_12dist_metrics_17MinkowskiDistance_rdist’ from incompatible pointer type [-Wincompatible-pointer-types]
       8114 |   __pyx_t_1 = __pyx_f_7hdbscan_12dist_metrics_17MinkowskiDistance_rdist(((struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *)__pyx_v_self), __pyx_v_x1, __pyx_v_x2, __pyx_v_size); if (unlikely(__pyx_t_1 == ((__pyx_t_7hdbscan_12dist_metrics_DTYPE_t)-1.0))) __PYX_ERR(1, 563, __pyx_L1_error)
            |                                                                         ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            |                                                                          |
            |                                                                          struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *
      hdbscan/dist_metrics.c:8030:166: note: expected ‘struct __pyx_obj_7hdbscan_12dist_metrics_MinkowskiDistance *’ but argument is of type ‘struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *’
       8030 | static __pyx_t_7hdbscan_12dist_metrics_DTYPE_t __pyx_f_7hdbscan_12dist_metrics_17MinkowskiDistance_rdist(struct __pyx_obj_7hdbscan_12dist_metrics_MinkowskiDistance *__pyx_v_self, __pyx_t_7hdbscan_12dist_metrics_DTYPE_t *__pyx_v_x1, __pyx_t_7hdbscan_12dist_metrics_DTYPE_t *__pyx_v_x2, __pyx_t_7hdbscan_12dist_metrics_ITYPE_t __pyx_v_size) {
            |                                                                                                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
      hdbscan/dist_metrics.c: In function ‘__pyx_f_7hdbscan_12dist_metrics_18WMinkowskiDistance_dist’:
      hdbscan/dist_metrics.c:8812:75: error: passing argument 1 of ‘__pyx_f_7hdbscan_12dist_metrics_18WMinkowskiDistance_rdist’ from incompatible pointer type [-Wincompatible-pointer-types]
       8812 |   __pyx_t_1 = __pyx_f_7hdbscan_12dist_metrics_18WMinkowskiDistance_rdist(((struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *)__pyx_v_self), __pyx_v_x1, __pyx_v_x2, __pyx_v_size); if (unlikely(__pyx_t_1 == ((__pyx_t_7hdbscan_12dist_metrics_DTYPE_t)-1.0))) __PYX_ERR(1, 622, __pyx_L1_error)
            |                                                                          ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            |                                                                           |
            |                                                                           struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *
      hdbscan/dist_metrics.c:8619:168: note: expected ‘struct __pyx_obj_7hdbscan_12dist_metrics_WMinkowskiDistance *’ but argument is of type ‘struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *’
       8619 | static __pyx_t_7hdbscan_12dist_metrics_DTYPE_t __pyx_f_7hdbscan_12dist_metrics_18WMinkowskiDistance_rdist(struct __pyx_obj_7hdbscan_12dist_metrics_WMinkowskiDistance *__pyx_v_self, __pyx_t_7hdbscan_12dist_metrics_DTYPE_t *__pyx_v_x1, __pyx_t_7hdbscan_12dist_metrics_DTYPE_t *__pyx_v_x2, __pyx_t_7hdbscan_12dist_metrics_ITYPE_t __pyx_v_size) {
            |                                                                                                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
      hdbscan/dist_metrics.c: In function ‘__pyx_f_7hdbscan_12dist_metrics_19MahalanobisDistance_dist’:
      hdbscan/dist_metrics.c:9641:76: error: passing argument 1 of ‘__pyx_f_7hdbscan_12dist_metrics_19MahalanobisDistance_rdist’ from incompatible pointer type [-Wincompatible-pointer-types]
       9641 |   __pyx_t_1 = __pyx_f_7hdbscan_12dist_metrics_19MahalanobisDistance_rdist(((struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *)__pyx_v_self), __pyx_v_x1, __pyx_v_x2, __pyx_v_size); if (unlikely(__pyx_t_1 == ((__pyx_t_7hdbscan_12dist_metrics_DTYPE_t)-1.0))) __PYX_ERR(1, 692, __pyx_L1_error)
            |                                                                           ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            |                                                                            |
            |                                                                            struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *
      hdbscan/dist_metrics.c:9390:170: note: expected ‘struct __pyx_obj_7hdbscan_12dist_metrics_MahalanobisDistance *’ but argument is of type ‘struct __pyx_obj_7hdbscan_12dist_metrics_DistanceMetric *’
       9390 | static __pyx_t_7hdbscan_12dist_metrics_DTYPE_t __pyx_f_7hdbscan_12dist_metrics_19MahalanobisDistance_rdist(struct __pyx_obj_7hdbscan_12dist_metrics_MahalanobisDistance *__pyx_v_self, __pyx_t_7hdbscan_12dist_metrics_DTYPE_t *__pyx_v_x1, __pyx_t_7hdbscan_12dist_metrics_DTYPE_t *__pyx_v_x2, __pyx_t_7hdbscan_12dist_metrics_ITYPE_t __pyx_v_size) {
            |                                                                                                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
      error: command '/usr/bin/gcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for hdbscan
Failed to build hdbscan
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (hdbscan)
Error: Error installing package(s): "'torch==2.2.0'", "'transformers==4.38.0'", "'huggingface_hub==0.20.0'", "'numpy==1.26.0'", "'pandas==2.0.3'", "'nltk==3.8.1'", "'scikit-learn==1.3.0'", "'datasets==2.16.1'", "'evaluate==0.4.0'", "'accelerate==0.26.0'", "'bertopic==0.16.3'", "'jsonschema==4.19.2'", "'sentence-transformers==2.2.2'", "'flair==0.13.0'", "'umap-learn==0.5.6'", "'hdbscan==0.8.33'", "'scipy==1.10.1'"

OscarKjell commented 2 months ago

Hi, thanks for reporting this. Which text-package version are you using?

(Also, here you can see it being built on Ubuntu: https://github.com/OscarKjell/text/actions/runs/10140715536/job/28036469916)

m-pilarski commented 2 months ago

I'm using text v1.2.3 and conda v24.7.1

(Also, my colleague has the same issue on his Arch Linux based system.)

m-pilarski commented 2 months ago

According to https://github.com/scikit-learn-contrib/hdbscan/issues/634, hdbscan cannot be compiled with gcc v14. The Ubuntu version in your test is probably still using gcc v13.
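
For context, the compile errors in the log above are all tagged `-Wincompatible-pointer-types`, a diagnostic that GCC 14 treats as an error by default, whereas earlier GCC releases only warned. Below is a minimal sketch in base R for checking whether a saved build log shows this signature. The helper name `has_pointer_type_error` and the abbreviated sample lines are illustrative assumptions, not part of text or pip:

```r
# Hypothetical helper (not part of the text package or pip): returns TRUE if
# any log line is a compile error tagged -Wincompatible-pointer-types.
has_pointer_type_error <- function(log_lines) {
  is_error <- grepl("error:", log_lines, fixed = TRUE)
  is_pointer_type <- grepl("[-Wincompatible-pointer-types]", log_lines, fixed = TRUE)
  any(is_error & is_pointer_type)
}

# Abbreviated sample lines modelled on the log above (function names shortened).
sample_log <- c(
  "hdbscan/dist_metrics.c:7338:75: error: passing argument 1 of 'f' from incompatible pointer type [-Wincompatible-pointer-types]",
  "npy_1_7_deprecated_api.h:17:2: warning: #warning \"Using deprecated NumPy API\" [-Wcpp]"
)

found <- has_pointer_type_error(sample_log)

# For a real log, something like the following could be used
# ("pip_build.log" is a hypothetical file name):
# has_pointer_type_error(readLines("pip_build.log"))
```

If the check is positive, the failure most likely matches the gcc 14 problem described in the linked hdbscan issue.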

Running the following code fixed the issue for me:

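# Optionally remove an existing environment first:
# reticulate::conda_remove("textrpp_condaenv")

# Recreate the conda environment used by text
reticulate::conda_create(
  envname="textrpp_condaenv",
  python_version="3.9.0"
)
# Install gcc 13.2.0 into the environment (hdbscan fails to build with gcc 14)
reticulate::conda_install(
  envname="textrpp_condaenv", 
  packages="gcc_linux-64==13.2.0"
)
# Install the Python dependencies
text::textrpp_install()

moomoofarm1 commented 2 months ago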

Yes, exactly. It depends on the version of the GCC compiler on Linux. I wonder whether we should include this workaround in the package or just print a message instead. @OscarKjell
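
The "print a message" option could be sketched roughly as follows in base R. This is an illustration under stated assumptions, not code from the text package: the helper names `gcc_major_version` and `check_gcc_for_hdbscan` are hypothetical, and the threshold of 14 is taken from the linked hdbscan issue.

```r
# Hypothetical helper: extract the major version from the output of
# `gcc -dumpversion`, e.g. "14.2.1" -> 14. Returns NA if no number is found.
gcc_major_version <- function(version_string) {
  match <- regmatches(version_string, regexpr("[0-9]+", version_string))
  if (length(match) == 0) return(NA_integer_)
  as.integer(match[[1]])
}

# Hypothetical check: returns FALSE and prints a hint when gcc >= 14
# (assumed threshold), otherwise returns TRUE.
check_gcc_for_hdbscan <- function(version_string) {
  major <- gcc_major_version(version_string)
  if (!is.na(major) && major >= 14L) {
    message(
      "gcc ", major, " detected: building hdbscan 0.8.33 from source may fail. ",
      "Consider installing gcc_linux-64==13.2.0 into the conda environment first."
    )
    return(FALSE)
  }
  TRUE
}

# Query the system compiler only if gcc is on the PATH.
gcc_path <- Sys.which("gcc")
if (nzchar(gcc_path)) {
  version_string <- system2(gcc_path, "-dumpversion", stdout = TRUE)
  gcc_ok <- check_gcc_for_hdbscan(version_string)
}
```

Keeping the version parsing separate from the system call makes the check easy to test without a compiler installed. Whether such a check belongs in textrpp_install() itself is a design decision for the maintainers.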