Closed by peter-pogorelov 5 years ago
Hi, thank you for the issue. I had already been contacted about this, and it should now be resolved.
Make sure to upgrade to the latest version, either with pip install -U fse or by building from the master branch. I've just released 0.1.15.
If the issue persists, please feel free to contact me again.
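To confirm that the upgrade actually took effect in the environment you are running, something like the following sketch can help. It uses only the standard library; the helper names (installed_version, version_tuple, has_fix) are made up for illustration, and it assumes the version is a plain dotted numeric string such as 0.1.15.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn a plain 'X.Y.Z' string into a tuple of ints (assumes numeric parts only)."""
    return tuple(int(part) for part in version.split("."))


def has_fix(version, minimum="0.1.15"):
    """True if `version` is at least `minimum`; False if the package is missing."""
    return version is not None and version_tuple(version) >= version_tuple(minimum)


# Prints False when fse is missing or older than 0.1.15.
print(has_fix(installed_version("fse")))
```

Comparing tuples of ints rather than raw strings matters here, because as strings "0.1.9" would sort after "0.1.15".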
The following code throws an error (TypeError: Cannot convert numpy.float32 to numpy.ndarray):
fb = load_facebook_model(path_to_model)
model = SIF(fb, alpha=1e-7, components=1)
model.train([IndexedSentence(s, i) for i, s in enumerate(sentences)])
model.sv.similar_by_sentence(['документы', 'бухгалтерия'], model=model, indexable=sentences)  # <-- this line raises the TypeError
However, if I load only the word vectors (KeyedVectors) instead of the full FastText model, everything seems to work:
ft = KeyedVectors.load_word2vec_format(path_to_vectors)
model = SIF(ft, alpha=1e-7, components=1)
model.train([IndexedSentence(s, i) for i, s in enumerate(sentences)])
model.sv.similar_by_sentence(['документы', 'бухгалтерия'], model=model, indexable=sentences)
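This problem is really important because the word counts in the vectors-only object (ft.wv.vocab) are not the same as the counts in the full model. They look as if they were recovered automatically from the vectors, possibly using cosine similarity (I'm not sure about that). Since SIF weights each word by its frequency, different counts mean different sentence embeddings.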
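To illustrate why the counts matter, here is a minimal sketch of the standard SIF weighting, alpha / (alpha + p(w)), where p(w) is the relative frequency of a word. It is not fse's actual implementation. The counts are invented for illustration, and rank_counts only mimics the case where a loader fills in placeholder counts that preserve frequency rank (an assumption about what happens, not something confirmed here).

```python
def sif_weights(counts, alpha=1e-3):
    """Return the SIF weight alpha / (alpha + p(w)) for every word."""
    total = sum(counts.values())
    return {word: alpha / (alpha + count / total) for word, count in counts.items()}


# Hypothetical corpus counts, as a full model might store them.
model_counts = {"the": 9000, "documents": 900, "accounting": 100}
# Hypothetical placeholder counts that only preserve the frequency rank.
rank_counts = {"the": 3, "documents": 2, "accounting": 1}

w_model = sif_weights(model_counts)
w_rank = sif_weights(rank_counts)

# Same words, same ordering by frequency, but noticeably different weights.
for word in model_counts:
    print(word, round(w_model[word], 5), round(w_rank[word], 5))
```

Rare words are weighted up in both cases, but the size of the effect depends on the actual counts, so the two ways of loading the embeddings would not be expected to produce identical sentence vectors.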