ddangelov / Top2Vec

Top2Vec learns jointly embedded topic, document and word vectors.
BSD 3-Clause "New" or "Revised" License

embedding_model is not available #339

Open Keszzz opened 1 year ago

Keszzz commented 1 year ago

Hello everyone. I was using the model a week ago without any problems. Today I am getting an error when I try to load Top2Vec with the embedding_model parameter (without it, everything works fine).

I get an error like this:

2023-07-10 10:30:20,136 - top2vec - INFO - Pre-processing documents for training
C:\Users\pkola\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py:528: UserWarning: The parameter 'token_pattern' will not be used since 'tokenizer' is not None'
  warnings.warn(
2023-07-10 10:30:20,416 - top2vec - INFO - Downloading universal-sentence-encoder-multilingual model
ValueError: Trying to load a model of incompatible/unknown type. 'C:\Users\...\Temp\tfhub_modules\26c892ffbc8d7b032f5a95f316e2841ed4f1608c' contains neither 'saved_model.pb' nor 'saved_model.pbtxt'.

I've tried reinstalling the encoders with pip install top2vec[sentence_encoders], but pip reports that all requirements are already satisfied. How can I solve this issue?

Keszzz commented 1 year ago

Here is the solution that worked for me:

  1. The error says that files are missing from this location: 'C:\Users\...\Temp\tfhub_modules\26c892ffbc8d7b032f5a95f316e2841ed4f1608c' (the location will be different for you)
  2. I manually downloaded the encoder from https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3
  3. Extract the downloaded .tar.gz file twice (once for the .gz layer, once for the .tar inside it)
  4. Drag and drop saved_model.pb into the location from step 1
  5. Drag and drop the files from the extracted variables folder into the variables folder in the location from step 1

Enjoy!
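For anyone who would rather script the manual steps, here is a minimal sketch using only the Python standard library. It assumes the archive was already downloaded by hand and that saved_model.pb and the variables folder sit at the top level of the archive. The function names `has_saved_model` and `install_encoder_archive` are made up for this example and are not part of Top2Vec or TensorFlow Hub.

```python
import tarfile
from pathlib import Path


def has_saved_model(module_dir: Path) -> bool:
    """Return True if the folder holds one of the two files named in the error."""
    return any(
        (module_dir / name).is_file()
        for name in ("saved_model.pb", "saved_model.pbtxt")
    )


def install_encoder_archive(archive: Path, module_dir: Path) -> None:
    """Extract a manually downloaded .tar.gz encoder into the cache folder.

    `archive` is the file you downloaded by hand; `module_dir` is the folder
    quoted in your own error message. Both are supplied by the caller.
    """
    module_dir.mkdir(parents=True, exist_ok=True)
    # Mode "r:gz" unpacks the gzip and tar layers in a single pass.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(module_dir)
    # Fail loudly if the archive layout was not what this sketch assumes.
    if not has_saved_model(module_dir):
        raise FileNotFoundError(
            f"no saved_model.pb or saved_model.pbtxt found in {module_dir}"
        )
```

To use it, call `install_encoder_archive(Path(<your downloaded file>), Path(<folder from your error message>))` and then load Top2Vec again. Only extract archives you downloaded yourself from a source you trust, since `extractall` writes whatever paths the archive contains.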