MaartenGr / PolyFuzz

Fuzzy string matching, grouping, and evaluation.
https://maartengr.github.io/PolyFuzz/
MIT License

Package incompatibility issue #23

Closed: craigjurs closed this issue 3 years ago

craigjurs commented 3 years ago

When trying to import polyfuzz, I get the following error:

import polyfuzz

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

Python 3.7.5

It seems to be related to the cosine similarity import:

/databricks/python/lib/python3.7/site-packages/polyfuzz/models/_tfidf.py in <module>
      5 from sklearn.feature_extraction.text import TfidfVectorizer
      6
----> 7 from ._utils import cosine_similarity

I don't see any version conflicts, and I made sure all the supporting packages meet the requirements defined in setup.py.

Package Version


asn1crypto 1.3.0
backcall 0.1.0
boto3 1.12.0
botocore 1.15.0
bpemb 0.3.3
certifi 2020.6.20
cffi 1.14.0
chardet 3.0.4
click 8.0.1
cloudpickle 1.6.0
cryptography 2.8
cycler 0.10.0
Cython 0.29.15
decorator 4.4.1
Deprecated 1.2.12
docutils 0.15.2
entrypoints 0.3
filelock 3.0.12
flair 0.7
ftfy 6.0.3
future 0.18.2
gdown 3.12.2
gensim 3.8.3
huggingface-hub 0.0.10
hyperopt 0.2.5
idna 2.8
importlib-metadata 3.10.1
ipykernel 5.1.4
ipython 7.12.0
ipython-genutils 0.2.0
Janome 0.4.1
jedi 0.14.1
jmespath 0.10.0
joblib 0.14.1
jupyter-client 5.3.4
jupyter-core 4.6.1
kiwisolver 1.1.0
koalas 1.2.0
konoha 4.6.5
langdetect 1.0.9
lxml 4.6.3
matplotlib 3.4.2
mpld3 0.3
networkx 2.5.1
numpy 1.19.4
overrides 3.1.0
packaging 20.9
pandas 1.0.1
parso 0.5.2
patsy 0.5.1
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.2.0
pip 20.0.2
polyfuzz 0.3.2
prompt-toolkit 3.0.3
protobuf 3.17.3
psycopg2 2.8.4
ptyprocess 0.6.0
pyarrow 1.0.1
pycparser 2.19
Pygments 2.5.2
pygobject 3.26.1
pyOpenSSL 19.1.0
pyparsing 2.4.6
PySocks 1.7.1
python-apt 1.6.5+ubuntu0.5
python-dateutil 2.8.1
pytz 2019.3
pyzmq 18.1.1
rapidfuzz 1.4.1
regex 2021.4.4
requests 2.25.1
s3transfer 0.3.3
sacremoses 0.0.45
scikit-learn 0.24.2
scipy 1.4.1
seaborn 0.11.1
segtok 1.5.10
sentencepiece 0.1.91
setuptools 45.2.0
six 1.14.0
smart-open 5.1.0
sparse-dot-topn 0.2.9
sqlitedict 1.7.0
ssh-import-id 5.7
statsmodels 0.11.0
tabulate 0.8.9
threadpoolctl 2.1.0
tokenizers 0.9.3
torch 1.8.1
tornado 6.0.3
tqdm 4.61.1
traitlets 4.3.3
transformers 3.5.1
typing-extensions 3.10.0.0
unattended-upgrades 0.1
urllib3 1.25.8
virtualenv 16.7.10
wcwidth 0.1.8
wheel 0.34.2
wrapt 1.12.1
zipp 3.4.1

Has anyone dealt with this before, or does anyone have ideas on how to fix it?

MaartenGr commented 3 years ago

Hmmm, there might be an issue with numpy. Could you try it again with numpy v1.20 or higher? I remember a similar issue with BERTopic that had the same error message.
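Before retrying the import, it can help to confirm which numpy version is actually installed. The following is a minimal sketch, not part of PolyFuzz: the helper names `parse_version` and `meets_minimum` are hypothetical, the parser only reads leading numeric release segments (pre-release tags are ignored), and the `(1, 20)` threshold is simply the version suggested above.

```python
import re


def parse_version(version):
    """Turn a version string such as '1.19.4' into a tuple of ints.

    Hypothetical helper: it reads leading numeric segments only and stops
    at the first segment that carries a suffix (e.g. '5+ubuntu0').
    """
    parts = []
    for segment in version.split("."):
        match = re.match(r"\d+", segment)
        if match is None:
            break
        parts.append(int(match.group()))
        if match.group() != segment:
            break  # segment had a non-numeric suffix; stop after its number
    return tuple(parts)


def meets_minimum(installed, minimum=(1, 20)):
    """Return True if the installed version is at least `minimum`."""
    return parse_version(installed) >= minimum


if __name__ == "__main__":
    try:
        import numpy
    except ImportError:
        print("numpy is not installed")
    else:
        # Report the installed version and whether it reaches the threshold.
        print(numpy.__version__, meets_minimum(numpy.__version__))
```

With the numpy 1.19.4 listed in the package table, `meets_minimum("1.19.4")` is False, which is consistent with the suggestion to upgrade.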

craigjurs commented 3 years ago

Hello, yes, thank you. That fixed the import issue.

Perhaps setup.py should use "numpy>= 1.18.5,<=1.20.0" instead of "numpy>= 1.18.5,<=1.19.4"?
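To illustrate why the original upper bound rules out the numpy version that fixed the import, the two ranges can be compared directly. This is a rough sketch under stated assumptions: `as_tuple` and `in_range` are hypothetical helpers, only plain numeric versions are handled, and both bounds are treated as inclusive as written in the two specifiers above. Real installers apply the full PEP 440 rules, which this does not implement.

```python
def as_tuple(version):
    """Convert a plain numeric version such as '1.20.0' to (1, 20, 0)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower, upper):
    """Return True if lower <= version <= upper (both bounds inclusive)."""
    return as_tuple(lower) <= as_tuple(version) <= as_tuple(upper)


# Bounds copied from the two specifiers discussed above.
OLD_BOUNDS = ("1.18.5", "1.19.4")
NEW_BOUNDS = ("1.18.5", "1.20.0")

if __name__ == "__main__":
    for candidate in ("1.19.4", "1.20.0", "1.20.1"):
        print(
            candidate,
            "old:", in_range(candidate, *OLD_BOUNDS),
            "new:", in_range(candidate, *NEW_BOUNDS),
        )
```

Note that an upper bound of `<=1.20.0` would still exclude 1.20.1 and later releases, so a looser cap may be preferable in practice.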

MaartenGr commented 3 years ago

Numpy was upgraded to the new version to be compatible with the package. If you try out the new version of PolyFuzz in a new environment, you should have no issues.

If it does not work out, please let me know!