doctor3030 opened this issue 3 years ago (status: Open)
I simply upgraded every pip-installed package and then it worked. Instructions for doing this are here: https://www.activestate.com/resources/quick-reads/how-to-update-all-python-packages/
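The guide linked above boils down to listing outdated packages and upgrading each one. A minimal Python sketch of that idea, building the upgrade command from `pip list --outdated --format=json` output (the sample JSON string here is an illustrative stand-in, not real output from my environment):

```python
# Build an upgrade command from `pip list --outdated --format=json` output.
# The sample string below is an illustrative stand-in for the real output.
import json

sample = ('[{"name": "numpy", "version": "1.18.5", "latest_version": "1.22.0"},'
          ' {"name": "scipy", "version": "1.4.1", "latest_version": "1.7.3"}]')

outdated = [pkg["name"] for pkg in json.loads(sample)]
command = "pip install --upgrade " + " ".join(outdated)
print(command)  # → pip install --upgrade numpy scipy
```

In a real shell you would pipe the actual `pip list --outdated --format=json` output into this instead of the sample string.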
Hello All,
I also faced the same issue for a few hours while working with the Top2Vec library, and fixed it simply by restarting the kernel after installing top2vec[sentence_encoders].
FYI: the kernel I was working on was a Kaggle one, but the error was exactly the same.
```shell
pip install hdbscan --no-build-isolation --no-binary :all:
```
worked for me!
Man, you literally saved my life. Nothing worked for hours and I could not figure out why. Thanks a lot!
I was having the same problem when using `efficient==1.1.1`; the cause was the version of scipy. With `numpy==1.19.5` and `tensorflow==2.4.0`, I fixed it by downgrading to `scipy==1.4.1`.
```shell
pip install hdbscan --no-cache-dir --no-binary :all: --no-build-isolation
```
Didn't work for me. I am running in a virtual environment (e.g. `tensorflow_macos_venv`) with Apple's machine-learning version of TensorFlow, which holds numpy back at 1.18.5.
```
Python 3.8.5 (default, Sep 4 2020, 02:22:02)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.23.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import numpy

In [2]: print(numpy.__version__)
1.18.5

In [3]: import tensorflow

In [4]: print(tensorflow.__version__)
2.4.0-rc0

In [5]: import hdbscan
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-3f5a460d7435> in <module>
----> 1 import hdbscan

~/tensorflow_macos_venv/lib/python3.8/site-packages/hdbscan/__init__.py in <module>
----> 1 from .hdbscan_ import HDBSCAN, hdbscan
      2 from .robust_single_linkage_ import RobustSingleLinkage, robust_single_linkage
      3 from .validity import validity_index
      4 from .prediction import (approximate_predict,
      5                          membership_vector,

~/tensorflow_macos_venv/lib/python3.8/site-packages/hdbscan/hdbscan_.py in <module>
     19 from scipy.sparse import csgraph
     20
---> 21 from ._hdbscan_linkage import (single_linkage,
     22                                mst_linkage_core,
     23                                mst_linkage_core_vector,

hdbscan/_hdbscan_linkage.pyx in init hdbscan._hdbscan_linkage()

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
```
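For context on what the error itself means: the byte counts in the message are sizes of numpy's ndarray C struct, so the compiled extension expected a newer, larger layout (88 bytes) than the numpy installed at runtime provides (80 bytes). A stdlib-only sketch of the underlying version comparison; the "built against" version below is an illustrative assumption, not a value hdbscan exposes:

```python
# Compare the runtime numpy version against the (assumed) build-time version.
# Both version strings are illustrative; hdbscan does not expose the latter.

def version_tuple(version: str) -> tuple:
    """Parse a version string like '1.18.5' or '2.4.0-rc0' into comparable ints."""
    parts = []
    for piece in version.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)

def binary_mismatch_likely(runtime: str, built_against: str) -> bool:
    """True when the installed numpy is older than the numpy used at build time."""
    return version_tuple(runtime) < version_tuple(built_against)

print(binary_mismatch_likely("1.18.5", "1.20.0"))  # → True (this environment's case)
print(binary_mismatch_likely("1.22.0", "1.20.0"))  # → False
```

This is why both "upgrade numpy" and "rebuild hdbscan from source against your numpy" show up as fixes in this thread: either side of the comparison can be moved.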
None of these suggestions worked for me. PS: I am using a Kaggle environment.
@adilosa @lmcinnes if there ever was to be a selected answer for this issue, this would be it. Thanks for this - extremely helpful. You have my respect sir/ma'am.
Will second this! Sharing what worked for me in case it can help someone else:
* The only thing that worked for me with the version pins in my requirements.txt was to install with `--no-build-isolation`
* `--no-binary` alone was not able to solve the issue
See below for my `requirements.txt` and the relevant Dockerfile section:

```
# requirements.txt
tensorflow==1.15.2
numpy==1.18.1
scikit-learn==0.22.1
```

```dockerfile
# Dockerfile
RUN python -m pip install --upgrade pip setuptools
ADD requirements.txt .
RUN pip install -r ./requirements.txt --no-cache-dir
RUN pip install hdbscan --no-cache-dir --no-binary :all: --no-build-isolation
```
This one worked for me. Thank you!
Tried the exact same things; it did not work. I am getting the same error: `ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject`. Also, for the above tensorflow, the required numpy should be >1.17, while `numpy==1.16.0` was suggested in a thread above. How do I solve this problem? Is there anything I am missing? As discussed above, the Python version shouldn't be a problem; I am using 3.6.0.
Thanks, Poorva
Hey guys, I got this error today and solved it after some trial and error. Here are the commands; I restarted the kernel after installing the libraries:

```shell
!pip install seaborn --user
!pip install pandas --user
```
Use a Python virtual environment and install gensim with `pip install gensim==3.8.3`.
This one also worked for me while reimplementing my project, where numpy 1.19.5 is required by tensorflow 2.5/2.6. Also saved my life!
In a GitHub Action for mat_discover on ubuntu-latest, running just an `import hdbscan` command, I get the following:
```
/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/hdbscan/__init__.py:1: in <module>
    from .hdbscan_ import HDBSCAN, hdbscan
/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/hdbscan/hdbscan_.py:21: in <module>
    from ._hdbscan_linkage import (single_linkage,
hdbscan/_hdbscan_linkage.pyx:1: in init hdbscan._hdbscan_linkage
    ???
E   ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
_____ ERROR collecting mat_discover/tests/test_suggest_next_experiment.py ______
mat_discover/tests/test_suggest_next_experiment.py:7: in <module>
    from mat_discover.adaptive_design import Adapt
mat_discover/adaptive_design.py:6: in <module>
    from mat_discover.mat_discover_ import Discover, my_mvn
mat_discover/mat_discover_.py:31: in <module>
    import hdbscan
/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/hdbscan/__init__.py:1: in <module>
    from .hdbscan_ import HDBSCAN, hdbscan
/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/hdbscan/hdbscan_.py:21: in <module>
    from ._hdbscan_linkage import (single_linkage,
hdbscan/_hdbscan_linkage.pyx:1: in init hdbscan._hdbscan_linkage
    ???
E   ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
```
I've been trying various suggestions from this thread without luck (uninstalling and reinstalling numpy, using the special flags for the hdbscan pip install, etc.). I didn't think I had changed anything, especially since I had a working version 6 days ago. I used a diff checker to see the differences between the installed packages:
```
Successfully installed ElM2D-0.4.1 ElMD-0.4.8 MarkupSafe-2.0.1 Pygments-2.10.0 alabaster-0.7.12 attrs-21.3.0 babel-2.9.1 bounded-pool-executor-0.0.3 cfgv-3.3.1 chem-wasserstein-1.0.8 colorama-0.4.4 coverage-6.2 crabnet-1.2.1 cycler-0.11.0 cython-0.29.26 dill-0.3.4 dist-matrix-1.0.2 distlib-0.3.4 docutils-0.17.1 filelock-3.4.2 fonttools-4.28.5 hdbscan-0.8.27 identify-2.4.1 imagesize-1.3.0 importlib-resources-5.4.0 iniconfig-1.1.1 ipython-genutils-0.2.0 jinja2-3.0.3 joblib-1.1.0 jsonschema-4.3.2 jupyter-core-4.9.1 kaleido-0.2.1 kiwisolver-1.3.2 llvmlite-0.37.0 markdown-it-py-1.1.0 mat-discover-2.0.0 matplotlib-3.5.1 mdit-py-plugins-0.2.8 myst-parser-0.15.2 nbformat-5.1.3 nodeenv-1.6.0 numba-0.54.1 numpy-1.20.3 packaging-21.3 pandas-1.3.5 pillow-8.4.0 platformdirs-2.4.1 plotly-5.5.0 pluggy-1.0.0 pqdm-0.1.0 pre-commit-2.16.0 psutil-5.8.0 py-1.11.0 pynndescent-0.5.5 pyparsing-3.0.6 pyrsistent-0.18.0 pytest-6.2.5 pytest-cov-3.0.0 python-dateutil-2.8.2 pytz-2021.3 pyyaml-6.0 scikit-learn-1.0.2 scipy-1.7.3 seaborn-0.11.2 six-1.16.0 snowballstemmer-2.2.0 sphinx-4.2.0 sphinxcontrib-applehelp-1.0.2 sphinxcontrib-devhelp-1.0.2 sphinxcontrib-htmlhelp-2.0.0 sphinxcontrib-jsmath-1.0.1 sphinxcontrib-qthelp-1.0.3 sphinxcontrib-serializinghtml-1.1.5 tenacity-8.0.1 threadpoolctl-3.0.0 toml-0.10.2 tqdm-4.62.3 traitlets-5.1.1 umap-learn-0.5.2 virtualenv-20.11.0 zipp-3.6.0

Successfully installed ElM2D-0.4.1 ElMD-0.4.8 MarkupSafe-2.0.1 Pygments-2.11.1 alabaster-0.7.12 attrs-21.4.0 babel-2.9.1 bounded-pool-executor-0.0.3 cfgv-3.3.1 chem-wasserstein-1.0.8 colorama-0.4.4 coverage-6.2 crabnet-1.2.1 cycler-0.11.0 cython-0.29.26 dill-0.3.4 dist-matrix-1.0.2 distlib-0.3.4 docutils-0.17.1 filelock-3.4.2 fonttools-4.28.5 hdbscan-0.8.27 identify-2.4.1 imagesize-1.3.0 importlib-resources-5.4.0 iniconfig-1.1.1 ipython-genutils-0.2.0 jinja2-3.0.3 joblib-1.1.0 jsonschema-4.3.3 jupyter-core-4.9.1 kaleido-0.2.1 kiwisolver-1.3.2 llvmlite-0.37.0 markdown-it-py-1.1.0 mat-discover-2.0.0 matplotlib-3.5.1 mdit-py-plugins-0.2.8 myst-parser-0.15.2 nbformat-5.1.3 nodeenv-1.6.0 numba-0.54.1 numpy-1.20.3 packaging-21.3 pandas-1.3.5 pillow-9.0.0 platformdirs-2.4.1 plotly-5.5.0 pluggy-1.0.0 pqdm-0.1.0 pre-commit-2.16.0 psutil-5.9.0 py-1.11.0 pynndescent-0.5.5 pyparsing-3.0.6 pyrsistent-0.18.0 pytest-6.2.5 pytest-cov-3.0.0 python-dateutil-2.8.2 pytz-2021.3 pyyaml-6.0 scikit-learn-1.0.2 scipy-1.7.3 seaborn-0.11.2 six-1.16.0 snowballstemmer-2.2.0 sphinx-4.2.0 sphinxcontrib-applehelp-1.0.2 sphinxcontrib-devhelp-1.0.2 sphinxcontrib-htmlhelp-2.0.0 sphinxcontrib-jsmath-1.0.1 sphinxcontrib-qthelp-1.0.3 sphinxcontrib-serializinghtml-1.1.5 tenacity-8.0.1 threadpoolctl-3.0.0 toml-0.10.2 tqdm-4.62.3 traitlets-5.1.1 umap-learn-0.5.2 virtualenv-20.13.0 zipp-3.7.0
```
I then pinned every version that had changed back to its old version, but I still got the same error. I checked other things like the CPython version (3.8.12), all with no luck. A comparable workflow on my local Windows machine works just fine; I'm not sure why it fails on the GitHub Actions runners. I haven't tried WSL (or a GitHub Actions Windows runner), but I'm kind of hitting a wall on this one, especially since I don't have access to the computer it's running on.
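The package diff above can also be done with a few lines of Python instead of an online diff checker. A sketch; the two strings below are truncated samples of the lists above, not the full output:

```python
# Map each "name-version" token from a pip "Successfully installed ..." line to
# a dict, then diff two runs to find packages whose versions changed.

def parse_installed(line: str) -> dict:
    """Return {package: version} parsed from a 'Successfully installed ...' line."""
    prefix = "Successfully installed "
    body = line[len(prefix):] if line.startswith(prefix) else line
    pkgs = {}
    for token in body.split():
        name, _, version = token.rpartition("-")  # version follows the last '-'
        pkgs[name] = version
    return pkgs

old = parse_installed("Successfully installed Pygments-2.10.0 pillow-8.4.0 numpy-1.20.3")
new = parse_installed("Successfully installed Pygments-2.11.1 pillow-9.0.0 numpy-1.20.3")
changed = {name: (old[name], new[name]) for name in old if old[name] != new.get(name)}
print(changed)  # → {'Pygments': ('2.10.0', '2.11.1'), 'pillow': ('8.4.0', '9.0.0')}
```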
We encountered the same error. Minimal reproducible example:

```dockerfile
FROM python:3.8
RUN pip install numpy==1.20.3 hdbscan==0.8.27
RUN python -c 'import hdbscan'
```
This results in:

```
 > [3/3] RUN python -c 'import hdbscan':
#6 0.864 Traceback (most recent call last):
#6 0.864   File "<string>", line 1, in <module>
#6 0.864   File "/usr/local/lib/python3.8/site-packages/hdbscan/__init__.py", line 1, in <module>
#6 0.864     from .hdbscan_ import HDBSCAN, hdbscan
#6 0.864   File "/usr/local/lib/python3.8/site-packages/hdbscan/hdbscan_.py", line 21, in <module>
#6 0.864     from ._hdbscan_linkage import (single_linkage,
#6 0.864   File "hdbscan/_hdbscan_linkage.pyx", line 1, in init hdbscan._hdbscan_linkage
#6 0.864 ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
```
The only workaround I found was to upgrade to `numpy==1.22.0`.
@dszakallas thanks for the quick response! This would probably work if not for my dependency gridlock. It automatically rolls back to `numpy==1.20.3` for me, probably because of the `numba` dependency. Maybe I'll just wait it out and ignore that particular GitHub Actions workflow for now. Thank you again, definitely worth a shot.
The BERTopic package is running into the same issue: pip installs are not working anymore. Indeed, upgrading to a version higher than 1.20.3 seems to work for me. However, setting `pyproject.toml` with `oldest-supported-numpy` does not fix the issue. For me, it only happens when I install `hdbscan` together with `umap-learn`.
NumPy 1.22.0 was released only a few days ago, and it seems this issue has appeared since then. This does seem to mean, however, that some releases of numpy can affect hdbscan even if you use an older version of numpy yourself.

This thread was opened on the 31st of January, whereas NumPy 1.20 was released on the 30th of January. Now we see something similar: NumPy 1.22.0 was released a few days ago and the `ValueError` issue is popping up again.

I have no clue exactly what is happening here, but it seems that HDBSCAN is affected whenever numpy makes a new release.
@sgbaird If `numba` is the only thing that's bothering you, try downgrading `numba` to 0.53 first, then upgrading `numpy` to 1.22.0.
https://stackoverflow.com/questions/70148065/numba-needs-numpy-1-20-or-less-for-shapley-import
@swang423 thank you! This did the trick to get my GitHub Actions, `pip`-based `pytest` unit tests back up and running: `pip install numba==0.53.* numpy==1.22.0`
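After pinning like this, a quick stdlib-only sanity check (Python 3.8+) that the resolver actually kept the pins; the package names are the ones from this thread:

```python
# Print the installed version of each package, or note that it is missing.
# importlib.metadata is in the standard library on Python 3.8+.
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

for pkg in ("numpy", "numba", "hdbscan"):
    found = installed_version(pkg)
    print(f"{pkg}=={found}" if found else f"{pkg} is not installed")
```

This is handy in CI logs, since `pip` can silently resolve to a different version than the one you asked for when other pins conflict.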
@MaartenGr I have the same issue. And `numpy==1.22.0` causes a bug with `umap` when you use cosine distance. So now if `hdbscan` is working, `umap` is not; and if `umap` is working, I cannot get `hdbscan` to work.
https://github.com/lmcinnes/pynndescent/issues/163
:(
I face the same issue. I tried installing with `--no-cache-dir --no-binary :all: --no-build-isolation`, and via `project.toml` as well, but I still get the same error.

* python -V == 3.8.10
* numpy==1.22.0
* umap-learn==0.5.1
* hdbscan==0.8.27

But for some weird reason, when I install these packages with `conda install` I do not get the error; it only fails with `pip install`. The only difference is the numpy version (1.20.3).
@sgbaird @swang423 @MaartenGr Thanks for sharing all your inputs! `pip install numba==0.53.* numpy==1.22.0` also did the trick for me when trying to import `BERTopic` inside a Jupyter Notebook instance. Topic models are training just fine now.
```
bertopic     0.9.4     pypi_0
hdbscan      0.8.27    pypi_0
numba        0.53.0    pypi_0
numpy        1.22.0    pypi_0
pip          21.2.4    py39hecd8cb5_0
python       3.9.7     h88f2d9e_1
pyyaml       5.4.1     pypi_0
toml         0.10.2    pypi_0
umap-learn   0.5.2     pypi_0
```
I also have this problem. When I add `import hdbscan` to a script and try to run it, I get the following error:
`ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject`
I did some experiments, but found that this seems to be a problem with the hdbscan package itself rather than the version of numpy. If you installed hdbscan into your virtual environment with `pip install hdbscan`, please uninstall it, then reinstall with `conda install -c conda-forge hdbscan`.
Hope this solves your problem!
The issue turned out to be a fair bit less complex than I had thought 😅 The PyPI release does not yet have `oldest-supported-numpy` in its `pyproject.toml`. The master branch does have that fix, so simply installing hdbscan from master fixes the issue for me.
@lmcinnes Sorry to tag you like this, but it seems the issue should be solved whenever a new version is released to PyPI. Fortunately, this also means that after that release we will likely not see this issue popping up anymore.
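For reference, the fix amounts to declaring `oldest-supported-numpy` as a build-time requirement, so the wheel is compiled against the oldest numpy ABI each Python version supports. A sketch of what that looks like in `pyproject.toml` (the exact requirement list in hdbscan's repo may differ):

```toml
[build-system]
requires = ["setuptools", "wheel", "Cython", "oldest-supported-numpy"]
```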
I faced the same issue while working in an Anaconda environment. I stepped out of the conda environment, created a plain venv with Python 3.9.7, installed hdbscan using pip, and generated a requirements file. I then created a fresh conda env and installed hdbscan from that requirements file. I am able to use it now.

```
(pelog39) u1@ubuntu:~$ cat hdbscan_requirement.txt
Cython==0.29.26
hdbscan==0.8.27
joblib==1.1.0
numpy==1.22.0
scikit-learn==1.0.2
scipy==1.7.3
six==1.16.0
threadpoolctl==3.0.0
(pelog39) u1@ubuntu:~$ pip install -r hdbscan_requirement.txt
(pelog39) u1@ubuntu:~$ python -c 'import hdbscan'
(pelog39) u1@ubuntu:~$
```
For hdbscan to work with pytorch:
```shell
conda install -c conda-forge hdbscan
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
```
In my env, numpy==1.21.5 works
> @swang423 thank you! This did the trick to get my GitHub Actions, `pip`-based `pytest` unit tests back up and running: `pip install numba==0.53.* numpy==1.22.0`
This worked for me. Below is my env.yml (not complete; I had the numba issue as well, as someone mentioned above). Everything got fixed with these versions.
Downgrading to a suitable hdbscan version helped me here; use trial and error to find the appropriate one. The following versions worked for me:
```shell
%pip install hdbscan==0.8.33
%pip install numpy==1.20.3
```
When I try to import hdbscan, I get the following error:
--------------------------------------------------------------------------- ValueError Traceback (most recent call last)