MaartenGr / BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.
https://maartengr.github.io/BERTopic/
MIT License
6.09k stars, 757 forks

BERTopic Concept embedding - Query #444

Closed by nsankar 2 years ago

nsankar commented 2 years ago

Hello! I have a question. Is BERTopic a state-of-the-art alternative to VLAC? If so, which embedding method works best in BERTopic for finding topics that are more conceptual and contextual in the documents? Thanks in advance.

MaartenGr commented 2 years ago

> Is BERTopic a state-of-the-art alternative to VLAC?

No. VLAC is merely a representation of documents built on top of document embeddings. It is not a topic modeling technique like BERTopic, so the two are not alternatives to one another.

> Which embedding method works best in BERTopic for finding topics that are more conceptual and contextual in the documents?

Typically, it is best to go with the default embedding model, namely all-MiniLM-L6-v2. However, if you need a bit more accuracy when representing the documents, all-mpnet-base-v2 works extremely well. Both, by the way, are SentenceTransformer models.
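For readers who want to try the swap, here is a minimal sketch of how the two models mentioned above could be plugged into BERTopic. The helper names `choose_embedding_model` and `fit_topics` and the `prefer_accuracy` flag are hypothetical and exist only for this illustration. The sketch assumes the `bertopic` and `sentence-transformers` packages are installed and that `docs` is a list of strings.

```python
# Model names are taken from the reply above; the helpers are illustrative only.
DEFAULT_MODEL = "all-MiniLM-L6-v2"    # BERTopic's default, per the reply
ACCURATE_MODEL = "all-mpnet-base-v2"  # the suggested option when more accuracy is needed


def choose_embedding_model(prefer_accuracy: bool = False) -> str:
    """Return the SentenceTransformer model name for the given preference."""
    return ACCURATE_MODEL if prefer_accuracy else DEFAULT_MODEL


def fit_topics(docs, prefer_accuracy: bool = False):
    """Fit BERTopic on `docs` (a list of strings) with the chosen embedding model."""
    # Imported lazily so the model names can be inspected without the heavy dependencies.
    from bertopic import BERTopic
    from sentence_transformers import SentenceTransformer

    embedding_model = SentenceTransformer(choose_embedding_model(prefer_accuracy))
    topic_model = BERTopic(embedding_model=embedding_model)
    topics, probs = topic_model.fit_transform(docs)
    return topic_model, topics, probs
```

Calling `fit_topics(docs, prefer_accuracy=True)` would fit the topics on `all-mpnet-base-v2` embeddings, and `topic_model.get_topic_info()` can then be used to inspect them. Whether the larger model yields noticeably more conceptual topics depends on the corpus, so it is worth comparing both on a sample.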

nsankar commented 2 years ago

Got it. Thank you @MaartenGr