explosion / sense2vec

🦆 Contextually-keyed word vectors
https://explosion.ai/blog/sense2vec-reloaded
MIT License

sense2vec standalone stopped working - BUG? #123

Closed npattarone closed 3 years ago

npattarone commented 3 years ago

Hi! This piece of code (the very basic standalone example you have for sense2vec) was working perfectly last week, and now I'm getting the following error:

Traceback (most recent call last):
  File "../server/services/ner.py", line 7, in <module>
    most_similar = s2v.most_similar(query, n=3)
  File "../Library/Python/3.8/lib/python/site-packages/sense2vec/sense2vec.py", line 231, in most_similar
    result_keys, _, scores = self.vectors.most_similar(
  File "vectors.pyx", line 353, in spacy.vectors.Vectors.most_similar
ValueError: could not broadcast input array from shape (1,1187453) into shape (1,0)

The code is as follows:

from sense2vec import Sense2Vec

s2v = Sense2Vec().from_disk("../datasources/s2v_reddit_2015_md")
query = "natural_language_processing|NOUN"
assert query in s2v
freq = s2v.get_freq(query)
most_similar = s2v.most_similar(query, n=3)

Any ideas!? Please help ASAP 😔

Thanks a lot!!

honnibal commented 3 years ago

sense2vec hasn't had any releases recently, so I think the cause must be one of the other dependencies. If you look at the installed versions with pip list, you might be able to figure out what changed.
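
A minimal sketch of the same check from inside Python, in case it is easier to run it in the interpreter that actually fails. It uses only the standard library (importlib.metadata, available from Python 3.8). The helper name installed_versions and the package list are illustrative assumptions, not part of sense2vec.

```python
from importlib import metadata

# Packages that appear in the traceback, plus close dependencies.
# This list is an assumption; adjust it to your environment.
PACKAGES = ["sense2vec", "spacy", "thinc", "numpy"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:<12} {version or 'not installed'}")
```

Comparing this output between a working and a failing environment shows which versions differ.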

You should also check whether any data files aren't being found. I don't think a missing file would produce that error, but I'm not 100% sure. The fact that it's trying to broadcast into shape (1, 0) looks suspicious to me.
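
One way to check the data files is to list everything in the vectors directory with its size and flag any zero-byte files. This is only a sketch: the helpers summarize_data_dir and empty_files are hypothetical names, the path is the one from the snippet above, and which files a sense2vec package is supposed to contain is deliberately not assumed here.

```python
from pathlib import Path


def summarize_data_dir(path):
    """Return sorted (relative file name, size in bytes) pairs for files under path."""
    root = Path(path)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} is not a directory")
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


def empty_files(path):
    """Return the names of files under path that are zero bytes long."""
    return [name for name, size in summarize_data_dir(path) if size == 0]


if __name__ == "__main__":
    # Path taken from the snippet in this thread.
    data_dir = "../datasources/s2v_reddit_2015_md"
    for name, size in summarize_data_dir(data_dir):
        print(f"{size:>12}  {name}")
    print("empty files:", empty_files(data_dir))
```

Comparing the listing against a freshly downloaded copy of the vectors would show whether a file is missing or truncated.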

npattarone commented 3 years ago

Thanks @honnibal!! OK, here is the list (I removed the old interpreter environment and created a new one with virtualenv; I'm using PyCharm here):

Package         Version
--------------- ---------
blis            0.4.1
catalogue       1.0.0
certifi         2020.6.20
chardet         3.0.4
click           7.1.2
cymem           2.0.3
elasticsearch   7.9.1
fastapi         0.61.1
h11             0.9.0
httptools       0.1.1
idna            2.10
murmurhash      1.0.2
numpy           1.19.2
pandas          1.1.2
pip             20.2.3
plac            1.1.3
preshed         3.0.2
pydantic        1.6.1
python-dateutil 2.8.1
pytz            2020.1
requests        2.24.0
scipy           1.5.2
sense2vec       1.0.2
setuptools      50.3.0
six             1.15.0
spacy           2.3.2
srsly           1.0.2
starlette       0.13.6
thinc           7.4.1
tqdm            4.48.2
urllib3         1.25.10
uuid            1.30
uvicorn         0.11.8
uvloop          0.14.0
wasabi          0.8.0
websockets      8.1

Still getting the error... I have no idea what could have happened. I didn't make any changes; I just reopened the project and it started to fail 😓

npattarone commented 3 years ago

Never mind! I downloaded the Reddit 2015 vectors again and it stopped failing. Thanks for the help :)