Open
rashigupta8496 opened 4 weeks ago

Hi, I am running BERTopic on 10k sentences and it works fine, but if I run it on 15k it shows the following error (I need to run it on at least 50k sentences):

Is there a way to add more docs to an already fitted topic_model? Thanks!
It should be doable with that few documents, as 10K is generally not that much. It seems related to HDBSCAN, but I haven't seen this issue before. How much RAM do you have available? Also, it might be helpful to start from a fresh environment with a clean install of all packages.
Thanks for your response! I installed clean but it still shows the error. Available RAM: 217231130624. I am not using HDBSCAN; I am simply running the following:
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired

topic_model = BERTopic(
    embedding_model="thenlper/gte-small",
    min_topic_size=50,
    representation_model=KeyBERTInspired(),
)
topics, _ = topic_model.fit_transform(docs)  # docs: list of input sentences
I'm also curious if there is any way to keep adding documents to a fitted topic_model?
> Thanks for your response! I installed clean but it still shows the error. Available RAM: 217231130624.
How much is that in GB? Also, it might be helpful to start from a completely fresh environment and install the latest version of BERTopic in order to prevent any issues with previously installed packages.
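(For reference, a quick sanity check, assuming the reported figure is in bytes:)

ram_bytes = 217231130624
print(ram_bytes / 1024**3)  # ≈ 202.3 GiB, i.e. roughly 217 GB decimal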
> I'm also curious if there is any way to keep adding documents to a fitted topic_model?
There is! You can use either online topic modeling or the .merge_models functionality.
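For anyone reading along, a minimal sketch of both approaches; new_docs and doc_batches are placeholder names, and exact signatures may differ slightly between BERTopic versions:

from bertopic import BERTopic
from bertopic.vectorizers import OnlineCountVectorizer
from sklearn.decomposition import IncrementalPCA
from sklearn.cluster import MiniBatchKMeans

# Option 1: fit a second model on the new documents and merge it into
# the existing one (merge_models is available in recent BERTopic versions).
new_model = BERTopic(embedding_model="thenlper/gte-small", min_topic_size=50).fit(new_docs)
merged_model = BERTopic.merge_models([topic_model, new_model])

# Option 2: online topic modeling via .partial_fit. The default UMAP and
# HDBSCAN sub-models do not support partial_fit, so they are swapped for
# ones that do.
online_model = BERTopic(
    umap_model=IncrementalPCA(n_components=5),
    hdbscan_model=MiniBatchKMeans(n_clusters=50, random_state=0),
    vectorizer_model=OnlineCountVectorizer(stop_words="english"),
)
for batch in doc_batches:
    online_model.partial_fit(batch)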