Open zilch42 opened 1 year ago
It looks as if numba is balking at some type issues, but it is not clear to me what those could be from the error. This has worked acceptably on a few different machines and configurations I tried, so it isn't the default state of things.
What version of numba are you using?
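For anyone else hitting this, a quick way to answer the version question is a small script like the one below. It is only a sketch: the helper name `report_versions` and the default package list are my own choices, not part of fast_hdbscan or numba, and it relies only on the standard-library `importlib.metadata`.

```python
import platform
from importlib import metadata


def report_versions(packages=("numba", "llvmlite", "numpy", "fast-hdbscan")):
    """Return a dict mapping each distribution name to its installed version.

    Packages that are not installed map to None instead of raising.
    The default package list is an assumption about what matters here.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print the interpreter and OS first, since the reports in this thread
    # all come from Windows machines.
    print("Python", platform.python_version(), "on", platform.system())
    for name, version in report_versions().items():
        print(name, version or "not installed")
```

Pasting the output into the issue is shorter than a full environment listing and covers the packages most likely to matter for a numba typing error.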
On Wed, Mar 15, 2023 at 7:21 PM zilch42 @.***> wrote:
Hi there, I'm getting an error trying to import HDBSCAN from fast_hdbscan
from fast_hdbscan import HDBSCAN
TypingError Traceback (most recent call last)
Cell In[1], line 10
8 from sklearn.feature_extraction.text import CountVectorizer
9 from umap import UMAP
---> 10 from fast_hdbscan import HDBSCAN
11 import pickle
12 import sys
File c:\Users\abb064\AppData\Local\miniconda3\envs\csiro-horizon-scanning39\lib\site-packages\fast_hdbscan\__init__.py:7
5 random_state = np.random.RandomState(42)
6 random_data = random_state.random(size=(100, 3))
----> 7 HDBSCAN(allow_single_cluster=True).fit(random_data)
8 HDBSCAN(cluster_selection_method="leaf").fit(random_data)
10 __all__ = ["HDBSCAN", "fast_hdbscan"]
File c:\Users\abb064\AppData\Local\miniconda3\envs\csiro-horizon-scanning39\lib\site-packages\fast_hdbscan\hdbscan.py:217, in HDBSCAN.fit(self, X, y, **fit_params)
207 clean_data = X
209 kwargs = self.get_params()
211 (
212 self.labels_,
213 self.probabilities_,
214 self._single_linkage_tree,
215 self._condensed_tree,
216 self._min_spanning_tree,
--> 217 ) = fast_hdbscan(clean_data, return_trees=True, **kwargs)
219 self._condensed_tree = to_numpy_rec_array(self._condensed_tree)
221 if not self._all_finite:
222 # remap indices to align with original data in the case of non-finite entries.
File c:\Users\abb064\AppData\Local\miniconda3\envs\csiro-horizon-scanning39\lib\site-packages\fast_hdbscan\hdbscan.py:149, in fast_hdbscan(data, min_samples, min_cluster_size, cluster_selection_method, allow_single_cluster, return_trees)
147 sklearn_tree = KDTree(data)
148 numba_tree = kdtree_to_numba(sklearn_tree)
--> 149 edges = parallel_boruvka(
150 numba_tree, min_samples=min_cluster_size if min_samples is None else min_samples
151 )
152 sorted_mst = edges[np.argsort(edges.T[2])]
153 linkage_tree = mst_to_linkage_tree(sorted_mst)
File c:\Users\abb064\AppData\Local\miniconda3\envs\csiro-horizon-scanning39\lib\site-packages\fast_hdbscan\boruvka.py:270, in parallel_boruvka(tree, min_samples)
267 while n_components > 1:
268 candidate_distances, candidate_indices = boruvka_tree_query(tree, node_components, point_components,
269 core_distances)
--> 270 new_edges = merge_components(components_disjoint_set, candidate_indices, candidate_distances, point_components)
271 update_component_vectors(tree, components_disjoint_set, node_components, point_components)
273 edges = np.vstack((edges, new_edges))
File c:\Users\abb064\AppData\Local\miniconda3\envs\csiro-horizon-scanning39\lib\site-packages\numba\core\dispatcher.py:468, in _DispatcherBase._compile_for_args(self, *args, **kws)
464 msg = (f"{str(e).rstrip()} \n\nThis error may have been caused "
465 f"by the following argument(s):\n{args_str}\n")
466 e.patch_message(msg)
--> 468 error_rewrite(e, 'typing')
469 except errors.UnsupportedError as e:
470 # Something unsupported is present in the user code, add help info
471 error_rewrite(e, 'unsupported_error')
...
File "..........\AppData\Local\miniconda3\envs\csiro-horizon-scanning39\lib\site-packages\fast_hdbscan\boruvka.py", line 9:
def merge_components(disjoint_set, candidate_neighbors, candidate_neighbor_distances, point_components):
component_edges = {0: (0, np.int32(1), np.float32(0.0)) for i in range(0)}
Python 3.9.16
Numba 0.56.4
Here is my whole environment
# Name Version Build Channel
aiofiles 22.1.0 pypi_0 pypi
aiohttp 3.8.4 pypi_0 pypi
aiosignal 1.3.1 pypi_0 pypi
aiosqlite 0.18.0 pypi_0 pypi
anyio 3.6.2 pypi_0 pypi
argon2-cffi 21.3.0 pypi_0 pypi
argon2-cffi-bindings 21.2.0 pypi_0 pypi
arrow 1.2.3 pypi_0 pypi
asttokens 2.2.1 pypi_0 pypi
async-timeout 4.0.2 pypi_0 pypi
attrs 22.2.0 pypi_0 pypi
babel 2.12.1 pypi_0 pypi
backcall 0.2.0 pypi_0 pypi
beautifulsoup4 4.11.2 pypi_0 pypi
bertopic 0.14.1 pypi_0 pypi
bleach 6.0.0 pypi_0 pypi
blis 0.7.9 pypi_0 pypi
bzip2 1.0.8 h8ffe710_4 conda-forge
ca-certificates 2022.12.7 h5b45459_0 conda-forge
cachetools 5.3.0 pypi_0 pypi
catalogue 2.0.8 pypi_0 pypi
certifi 2022.12.7 pypi_0 pypi
cffi 1.15.1 pypi_0 pypi
charset-normalizer 3.1.0 pypi_0 pypi
click 8.1.3 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
comm 0.1.2 pypi_0 pypi
confection 0.0.4 pypi_0 pypi
contourpy 1.0.7 pypi_0 pypi
cycler 0.11.0 pypi_0 pypi
cymem 2.0.7 pypi_0 pypi
cython 0.29.33 pypi_0 pypi
cytoolz 0.12.1 pypi_0 pypi
debugpy 1.6.6 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
defusedxml 0.7.1 pypi_0 pypi
dnspython 2.3.0 pypi_0 pypi
exceptiongroup 1.1.0 pypi_0 pypi
executing 1.2.0 pypi_0 pypi
fast-hdbscan 0.1.0 pypi_0 pypi
fastjsonschema 2.16.3 pypi_0 pypi
filelock 3.9.0 pypi_0 pypi
fonttools 4.39.0 pypi_0 pypi
fqdn 1.5.1 pypi_0 pypi
frozenlist 1.3.3 pypi_0 pypi
hdbscan 0.8.29 pypi_0 pypi
huggingface-hub 0.13.0 pypi_0 pypi
idna 3.4 pypi_0 pypi
importlib-metadata 6.0.0 pypi_0 pypi
importlib-resources 5.12.0 pypi_0 pypi
iniconfig 2.0.0 pypi_0 pypi
ipykernel 6.21.3 pypi_0 pypi
ipython 8.11.0 pypi_0 pypi
ipython-genutils 0.2.0 pypi_0 pypi
ipywidgets 8.0.4 pypi_0 pypi
isoduration 20.11.0 pypi_0 pypi
jedi 0.18.2 pypi_0 pypi
jellyfish 0.9.0 pypi_0 pypi
jinja2 3.1.2 pypi_0 pypi
joblib 1.2.0 pypi_0 pypi
json5 0.9.11 pypi_0 pypi
jsonpointer 2.3 pypi_0 pypi
jsonschema 4.17.3 pypi_0 pypi
jupyter-client 8.0.3 pypi_0 pypi
jupyter-core 5.2.0 pypi_0 pypi
jupyter-events 0.6.3 pypi_0 pypi
jupyter-server 2.4.0 pypi_0 pypi
jupyter-server-fileid 0.8.0 pypi_0 pypi
jupyter-server-terminals 0.4.4 pypi_0 pypi
jupyter-server-ydoc 0.6.1 pypi_0 pypi
jupyter-ydoc 0.2.2 pypi_0 pypi
jupyterlab 3.6.1 pypi_0 pypi
jupyterlab-pygments 0.2.2 pypi_0 pypi
jupyterlab-server 2.20.0 pypi_0 pypi
jupyterlab-widgets 3.0.5 pypi_0 pypi
kiwisolver 1.4.4 pypi_0 pypi
langcodes 3.3.0 pypi_0 pypi
libffi 3.4.2 h8ffe710_5 conda-forge
libsqlite 3.40.0 hcfcfb64_0 conda-forge
libzlib 1.2.13 hcfcfb64_4 conda-forge
llvmlite 0.39.1 pypi_0 pypi
loguru 0.6.0 pypi_0 pypi
markupsafe 2.1.2 pypi_0 pypi
matplotlib 3.7.1 pypi_0 pypi
matplotlib-inline 0.1.6 pypi_0 pypi
mistune 2.0.5 pypi_0 pypi
multidict 6.0.4 pypi_0 pypi
murmurhash 1.0.9 pypi_0 pypi
nbclassic 0.5.3 pypi_0 pypi
nbclient 0.7.2 pypi_0 pypi
nbconvert 7.2.9 pypi_0 pypi
nbformat 5.7.3 pypi_0 pypi
nest-asyncio 1.5.6 pypi_0 pypi
networkx 3.0 pypi_0 pypi
nltk 3.8.1 pypi_0 pypi
notebook 6.5.3 pypi_0 pypi
notebook-shim 0.2.2 pypi_0 pypi
numba 0.56.4 pypi_0 pypi
numpy 1.23.5 pypi_0 pypi
openai 0.27.1 pypi_0 pypi
openssl 3.0.8 hcfcfb64_0 conda-forge
packaging 23.0 pypi_0 pypi
pandas 1.5.3 pypi_0 pypi
pandocfilters 1.5.0 pypi_0 pypi
parso 0.8.3 pypi_0 pypi
pathy 0.10.1 pypi_0 pypi
pickleshare 0.7.5 pypi_0 pypi
pillow 9.4.0 pypi_0 pypi
pip 23.0.1 pyhd8ed1ab_0 conda-forge
platformdirs 3.1.0 pypi_0 pypi
plotly 5.13.1 pypi_0 pypi
pluggy 1.0.0 pypi_0 pypi
preshed 3.0.8 pypi_0 pypi
prometheus-client 0.16.0 pypi_0 pypi
prompt-toolkit 3.0.38 pypi_0 pypi
psutil 5.9.4 pypi_0 pypi
pure-eval 0.2.2 pypi_0 pypi
pycparser 2.21 pypi_0 pypi
pydantic 1.10.6 pypi_0 pypi
pygments 2.14.0 pypi_0 pypi
pymongo 4.3.3 pypi_0 pypi
pynndescent 0.5.8 pypi_0 pypi
pyparsing 3.0.9 pypi_0 pypi
pyphen 0.13.2 pypi_0 pypi
pyrsistent 0.19.3 pypi_0 pypi
pytest 7.2.2 pypi_0 pypi
python 3.9.16 h4de0772_0_cpython conda-forge
python-dateutil 2.8.2 pypi_0 pypi
python-dotenv 1.0.0 pypi_0 pypi
python-json-logger 2.0.7 pypi_0 pypi
pytz 2022.7.1 pypi_0 pypi
pywin32 305 pypi_0 pypi
pywinpty 2.0.10 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
pyzmq 25.0.0 pypi_0 pypi
regex 2022.10.31 pypi_0 pypi
requests 2.28.2 pypi_0 pypi
rfc3339-validator 0.1.4 pypi_0 pypi
rfc3986-validator 0.1.1 pypi_0 pypi
river 0.14.0 pypi_0 pypi
scikit-learn 1.2.1 pypi_0 pypi
scipy 1.10.1 pypi_0 pypi
send2trash 1.8.0 pypi_0 pypi
sentence-transformers 2.2.2 pypi_0 pypi
sentencepiece 0.1.97 pypi_0 pypi
setuptools 67.5.1 pyhd8ed1ab_0 conda-forge
six 1.16.0 pypi_0 pypi
smart-open 6.3.0 pypi_0 pypi
sniffio 1.3.0 pypi_0 pypi
soupsieve 2.4 pypi_0 pypi
spacy 3.5.0 pypi_0 pypi
spacy-legacy 3.0.12 pypi_0 pypi
spacy-loggers 1.0.4 pypi_0 pypi
srsly 2.4.6 pypi_0 pypi
stack-data 0.6.2 pypi_0 pypi
tenacity 8.2.2 pypi_0 pypi
terminado 0.17.1 pypi_0 pypi
textacy 0.12.0 pypi_0 pypi
textblob 0.17.1 pypi_0 pypi
thinc 8.1.9 pypi_0 pypi
threadpoolctl 3.1.0 pypi_0 pypi
tinycss2 1.2.1 pypi_0 pypi
tk 8.6.12 h8ffe710_0 conda-forge
tokenizers 0.13.2 pypi_0 pypi
tomli 2.0.1 pypi_0 pypi
toolz 0.12.0 pypi_0 pypi
topicmodeltuner 0.3.4 pypi_0 pypi
torch 1.13.1 pypi_0 pypi
torchvision 0.14.1 pypi_0 pypi
tornado 6.2 pypi_0 pypi
tqdm 4.65.0 pypi_0 pypi
traitlets 5.9.0 pypi_0 pypi
transformers 4.26.1 pypi_0 pypi
typer 0.7.0 pypi_0 pypi
typing-extensions 4.5.0 pypi_0 pypi
tzdata 2022g h191b570_0 conda-forge
ucrt 10.0.22621.0 h57928b3_0 conda-forge
umap-learn 0.5.3 pypi_0 pypi
uri-template 1.2.0 pypi_0 pypi
urllib3 1.26.14 pypi_0 pypi
vc 14.3 hb6edc58_10 conda-forge
vs2015_runtime 14.34.31931 h4c5c07a_10 conda-forge
wasabi 1.1.1 pypi_0 pypi
wcwidth 0.2.6 pypi_0 pypi
webcolors 1.12 pypi_0 pypi
webencodings 0.5.1 pypi_0 pypi
websocket-client 1.5.1 pypi_0 pypi
wheel 0.38.4 pyhd8ed1ab_0 conda-forge
widgetsnbextension 4.0.5 pypi_0 pypi
win32-setctime 1.1.0 pypi_0 pypi
xz 5.2.6 h8d14728_0 conda-forge
y-py 0.5.9 pypi_0 pypi
yarl 1.8.2 pypi_0 pypi
ypy-websocket 0.8.2 pypi_0 pypi
zipp 3.15.0 pypi_0 pypi
Running into the same error. I have this additional error message:
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Cannot unify DictType[int32,Tuple(int64, int32, float32)]<iv=None> and DictType[int64,Tuple(int64, int32, float32)]<iv=None> for 'closure__locals___dictcomp__v167__vphi36_0', defined at C:\Users\redacted\AppData\Local\pypoetry\Cache\virtualenvs\redacted\lib\site-packages\fast_hdbscan\boruvka.py (9)
File "..\..\..\AppData\Local\pypoetry\Cache\virtualenvs\redacted\lib\site-packages\fast_hdbscan\boruvka.py", line 9:
def merge_components(disjoint_set, candidate_neighbors, candidate_neighbor_distances, point_components):
component_edges = {0: (0, np.int32(1), np.float32(0.0)) for i in range(0)}
Python 3.9.0 Numba 0.56.4
Seems like an int32 vs int64 mismatch in the dict key type.
EDIT: found it. Will create a PR
Getting a kinda different import error:
TypingError Traceback (most recent call last)
Cell In[1], line 20
18 import optuna
19 import multiprocessing
---> 20 import fast_hdbscan
File c:\Users\.venv\Lib\site-packages\fast_hdbscan\__init__.py:7
5 random_state = np.random.RandomState(42)
6 random_data = random_state.random(size=(100, 3))
----> 7 HDBSCAN(allow_single_cluster=True).fit(random_data)
8 HDBSCAN(cluster_selection_method="leaf").fit(random_data)
10 __all__ = ["HDBSCAN", "fast_hdbscan"]
File c:\Users\.venv\Lib\site-packages\fast_hdbscan\hdbscan.py:217, in HDBSCAN.fit(self, X, y, **fit_params)
207 clean_data = X
209 kwargs = self.get_params()
211 (
212 self.labels_,
213 self.probabilities_,
214 self._single_linkage_tree,
215 self._condensed_tree,
216 self._min_spanning_tree,
--> 217 ) = fast_hdbscan(clean_data, return_trees=True, **kwargs)
219 self._condensed_tree = to_numpy_rec_array(self._condensed_tree)
...
File "..\.venv\Lib\site-packages\fast_hdbscan\boruvka.py", line 9:
def merge_components(disjoint_set, candidate_neighbors, candidate_neighbor_distances, point_components):
component_edges = {0: (0, np.int32(1), np.float32(0.0)) for i in range(0)}
My environment:
alembic==1.11.1
anyio==3.7.1
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arrow==1.2.3
asttokens==2.2.1
async-lru==2.0.3
attrs==23.1.0
Babel==2.12.1
backcall==0.2.0
beautifulsoup4==4.12.2
bleach==6.0.0
blis==0.7.9
catalogue==2.0.8
certifi==2023.5.7
cffi==1.15.1
charset-normalizer==3.2.0
click==8.1.4
cmaes==0.10.0
colorama==0.4.6
colorlog==6.7.0
comm==0.1.3
confection==0.1.0
contourpy==1.1.0
coverage==7.2.7
cycler==0.11.0
cymem==2.0.7
Cython==0.29.36
dacite==1.8.1
debugpy==1.6.7
decorator==5.1.1
defusedxml==0.7.1
el-core-news-sm @ https://github.com/explosion/spacy-models/releases/download/el_core_news_sm-3.6.0/el_core_news_sm-3.6.0-py3-none-any.whl#sha256=84babf74d3c42e347b2b5ed809007843e97e7b595ce8f003b7391957d6146a8d
executing==1.2.0
fast-hdbscan==0.1.0
fastjsonschema==2.17.1
filelock==3.12.2
fonttools==4.40.0
fqdn==1.5.1
fsspec==2023.6.0
future==0.18.3
gensim==4.3.1
greenlet==2.0.2
hdbscan==0.8.30
htmlmin==0.1.12
huggingface-hub==0.16.4
idna==3.4
ImageHash==4.3.1
iniconfig==2.0.0
ipykernel==6.24.0
ipython==8.14.0
ipywidgets==8.0.7
isoduration==20.11.0
jedi==0.18.2
Jinja2==3.1.2
joblib==1.3.1
json5==0.9.14
jsonpointer==2.4
jsonschema==4.18.1
jsonschema-specifications==2023.6.1
jupyter-events==0.6.3
jupyter-lsp==2.2.0
jupyter_client==8.3.0
jupyter_core==5.3.1
jupyter_server==2.7.0
jupyter_server_terminals==0.4.4
jupyterlab==4.0.2
jupyterlab-pygments==0.2.2
jupyterlab-vim==0.16.0
jupyterlab-widgets==3.0.8
jupyterlab_server==2.23.0
kiwisolver==1.4.4
kneed==0.8.5
langcodes==3.3.0
llvmlite==0.40.1
Mako==1.2.4
MarkupSafe==2.1.3
matplotlib==3.7.2
matplotlib-inline==0.1.6
mistune==3.0.1
mpmath==1.3.0
multimethod==1.9.1
murmurhash==1.0.9
nbclient==0.8.0
nbconvert==7.6.0
nbformat==5.9.1
nest-asyncio==1.5.6
networkx==3.1
nltk==3.8.1
notebook_shim==0.2.3
numba==0.57.1
numpy==1.24.0
optuna==3.2.0
overrides==7.3.1
packaging==23.1
pandas==2.0.3
pandocfilters==1.5.0
parso==0.8.3
pathy==0.10.2
patsy==0.5.3
phik==0.12.3
pickleshare==0.7.5
Pillow==10.0.0
platformdirs==3.8.1
plotly==5.15.0
pluggy==1.2.0
preshed==3.0.8
prometheus-client==0.17.1
prompt-toolkit==3.0.39
psutil==5.9.5
pure-eval==0.2.2
pycparser==2.21
pydantic==1.10.11
Pygments==2.15.1
pynndescent==0.5.10
pyparsing==3.0.9
pytest==7.4.0
pytest-cov==4.1.0
python-dateutil==2.8.2
python-json-logger==2.0.7
pytz==2023.3
PyWavelets==1.4.1
pywin32==306
pywinpty==2.0.10
PyYAML==6.0
pyzmq==25.1.0
referencing==0.29.1
regex==2023.6.3
requests==2.31.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.8.10
safetensors==0.3.1
scikit-learn==1.3.0
scipy==1.10.1
seaborn==0.12.2
Send2Trash==1.8.2
sentence-transformers==2.2.2
sentencepiece==0.1.99
six==1.16.0
smart-open==6.3.0
sniffio==1.3.0
soupsieve==2.4.1
spacy==3.6.0
spacy-legacy==3.0.12
spacy-loggers==1.0.4
SQLAlchemy==2.0.19
srsly==2.4.6
stack-data==0.6.2
statsmodels==0.14.0
sympy==1.12
tangled-up-in-unicode==0.2.0
tenacity==8.2.2
terminado==0.17.1
thinc==8.1.10
threadpoolctl==3.2.0
tinycss2==1.2.1
tokenizers==0.13.3
torch==2.0.1
torchvision==0.15.2
tornado==6.3.2
tqdm==4.65.0
traitlets==5.9.0
transformers==4.30.2
typeguard==2.13.3
typer==0.9.0
typing_extensions==4.7.1
tzdata==2023.3
umap-learn==0.5.3
uri-template==1.3.0
urllib3==2.0.3
visions==0.7.5
wasabi==1.1.2
wcwidth==0.2.6
webcolors==1.13
webencodings==0.5.1
websocket-client==1.6.1
widgetsnbextension==4.0.8
wordcloud==1.9.2
ydata-profiling==4.3.1