Open OnAnd0n opened 3 days ago
When merging the Topic_model (including all data, with outliers) and the Out_Topic_model (consisting only of outliers), the 'Count' of the Topic_model for -1 increases by the number of outliers, instead of effectively concatenating them.
I have a hard time understanding what you exactly mean here. Could you give an example? Perhaps showcase what is happening and what you would expect to happen?
The Representative_docs are displayed as NaN. Is this the only way?
The representative documents are indeed displayed as NaN, since merge_models is also meant for federated learning, where the original documents are not necessarily available to the merged model. If you want representative documents re-calculated, I would advise checking the issues page. I believe there are a number of issues that describe in detail how you can do this.
I would like to use merge_models in BERTopic to re-cluster the outliers with HDBSCAN and merge them with the existing topics.
However, there are currently some challenges with the merge_models functionality:
When merging the Topic_model (including all data, with outliers) and the Out_Topic_model (consisting only of outliers), the 'Count' of the Topic_model for -1 increases by the number of outliers, instead of effectively concatenating them.
The Representative_docs are displayed as NaN. Is this the only way?
My BERTopic version is 0.16.3.
How can these issues be resolved?
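One possible direction, sketched here with made-up labels rather than taken from the thread, is to skip the model merge for this use case and write the re-clustered outlier labels back into the full topic list at the document level. The helper `reassign_outliers` and both example lists are hypothetical. The sketch assumes the second model was fitted on exactly the documents labeled -1, in their original order.

```python
from collections import Counter


def reassign_outliers(topics, outlier_topics):
    """Write re-clustered outlier labels back into the full topic list.

    topics: one topic id per document from the main model (-1 = outlier).
    outlier_topics: one topic id per outlier document, in the order the
        outliers appear in `topics`, from a second model fitted only on
        those documents (-1 = still an outlier).
    """
    n_outliers = sum(1 for t in topics if t == -1)
    if n_outliers != len(outlier_topics):
        raise ValueError("outlier_topics needs one label per outlier in topics")

    # New topic ids start after the largest existing id so they cannot collide.
    offset = max(topics, default=-1) + 1
    relabeled = iter(outlier_topics)
    merged = []
    for t in topics:
        if t != -1:
            merged.append(t)  # keep the main model's assignment
        else:
            new_t = next(relabeled)
            # Documents that are still outliers stay at -1.
            merged.append(-1 if new_t == -1 else new_t + offset)
    return merged


# Hypothetical labels for eight documents; three are outliers in the main model.
topics = [0, 1, -1, 0, -1, 1, -1, 0]
# Hypothetical result of re-clustering those three outliers.
outlier_topics = [0, -1, 0]

new_topics = reassign_outliers(topics, outlier_topics)
print(Counter(topics))      # sizes before reassignment
print(Counter(new_topics))  # sizes after reassignment
```

With BERTopic, a list like `new_topics` could then be passed to `topic_model.update_topics(docs, topics=new_topics)`, the call the outlier-reduction documentation uses, so that the topic representations are rebuilt from the documents. Whether that fits this particular setup has not been verified here.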