I am trying to use Top2Vec with the pretrained models. universal-sentence-encoder did not work because, for some reason, the embedding_batch_size parameter was reported as invalid for it, so I then had to switch to an SBERT model. My corpus is just English words, so I tried to load all-MiniLM-L6-v2, but it showed "all-MiniLM-L6-v2 is an invalid embedding model". Can anybody tell me why these issues are happening?
Edit: On further investigation of the Top2Vec.py file, I found that the only acceptable models are:
doc2vec
universal-sentence-encoder
universal-sentence-encoder-multilingual
distiluse-base-multilingual-cased
This is in contrast to the 8 models listed in the official API doc, where all-MiniLM-L6-v2 is listed as well. Could someone explain the reason behind this discrepancy and which models are actually supported?
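For reference, the check I ran into appears to behave like the sketch below. This is not Top2Vec's actual code, just a minimal reconstruction of the validation I observed: the model list is the one I found in Top2Vec.py, and the function name is hypothetical.

```python
# Hypothetical sketch of the embedding-model validation observed in Top2Vec.py.
# The four names below are the ones actually accepted in the installed version;
# the function itself is illustrative, not Top2Vec's real API.
ACCEPTABLE_EMBEDDING_MODELS = [
    "doc2vec",
    "universal-sentence-encoder",
    "universal-sentence-encoder-multilingual",
    "distiluse-base-multilingual-cased",
]

def validate_embedding_model(name: str) -> str:
    """Raise ValueError for any model outside the hard-coded list."""
    if name not in ACCEPTABLE_EMBEDDING_MODELS:
        raise ValueError(f"{name} is an invalid embedding model.")
    return name
```

Passing "all-MiniLM-L6-v2" to a check like this raises the same "invalid embedding model" error from the question, even though that model appears in the API doc.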