Open vegabook opened 3 months ago
Where can I get the full 700 actual embedding vectors?
The embeddings of the documents are not saved in the model as that would blow up the model's size. If you want the embeddings of the documents, I would advise reading through the best practices.
I see. So when you say, in the best practices, that embeddings can be pre-calculated and fed to the model "especially if you want to iterate over parameters", this assumes that you wish to explore parameters other than the embedding model itself, right?
I.e.: calculate the embeddings once, then explore clustering, dimensionality reduction, etc. using the same embeddings?
So the phrase should not contain the word "especially"; it should instead read "This process can be very costly if we want to iterate over [other] parameters"?
Apologies for the semantic pedantry, but I just want to make sure I understand you correctly and that I'm not missing something.
It is indeed meant to say that pre-calculating the embeddings is preferred when you wish to explore parameters other than the embedding model.
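To make the recommendation concrete: the idea is to compute one embedding per document yourself, keep that matrix around, and hand it to BERTopic, so the full set of per-document vectors never needs to be stored inside the model. A minimal sketch of the shapes involved, using random vectors as a stand-in for a real encoder (the 384-dimension figure assumes the default `all-MiniLM-L6-v2` sentence-transformers model; the 700/11 counts mirror the numbers in this thread):

```python
import numpy as np

# Stand-in for the per-document embeddings a real encoder would produce
# for ~700 articles. With the default "all-MiniLM-L6-v2" model each
# vector would have 384 dimensions (an assumption for illustration).
n_docs, dim = 700, 384
rng = np.random.default_rng(0)
doc_embeddings = rng.standard_normal((n_docs, dim))

# One vector per document -- this is the full matrix the question asks for.
print(doc_embeddings.shape)  # (700, 384)

# By contrast, topic_embeddings_ holds one vector per *topic* (e.g. 11),
# so it cannot contain the per-document vectors. Slicing here is only a
# placeholder to show the shape difference.
n_topics = 11
topic_embeddings = doc_embeddings[:n_topics]
print(topic_embeddings.shape)  # (11, 384)
```

In practice the matrix would come from something like `SentenceTransformer("all-MiniLM-L6-v2").encode(d)` and can then be passed via `topic_model.fit_transform(d, embeddings=doc_embeddings)`. Keeping your own reference to that matrix is how you retain the full 700 vectors, since the fitted model only stores the much smaller per-topic `topic_embeddings_`.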
I'm passing a list of approximately 700 articles to the default embedding function, where `d` is a simple list of strings (each an article of roughly 200 words). However, when I call
`topic_model.get_topic_info()`
I see only 11 topics, and their associated embeddings via the attribute `topic_embeddings_`. Where can I get the full 700 actual embedding vectors?