Open suprateek-19 opened 1 year ago
That is currently not implemented. However, you can use the internal hdbscan model (concept_model.hdbscan_model) to extract the probabilities using its approximate_predict or hdbscan.membership_vector functions.
I get the following error when trying to access the above, which is also a known issue. Is there any other way to get a probability distribution across concepts for images? The error is: AttributeError: 'HDBSCAN' object has no attribute 'approximate_predict'
@shilpiag123 You should use it as follows:
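import hdbscan
# cluster_model is the fitted HDBSCAN model (concept_model.hdbscan_model); embeddings are the pre-calculated image embeddings
probabilities = hdbscan.membership_vector(cluster_model, embeddings)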
Having said that, you will have to access the cluster model and also pre-calculate the embeddings. I would instead advise using BERTopic v0.15, which now has support for topic modeling with images, very similar to Concept.
Currently we only get the predicted class through concept_model.transform().
Can we get the predicted probabilities for each cluster or the top n clusters?
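For the "top n clusters" part of the question: hdbscan.membership_vector returns one row per input point and one column per cluster, so picking the top n is a matter of ranking each row. Below is a minimal sketch in plain Python. The helper name top_n_clusters and the sample values are hypothetical and only illustrate the shape of the data; they are not part of Concept or hdbscan.

```python
def top_n_clusters(membership, n=3):
    """For each row, return the n (cluster_index, probability) pairs
    with the highest probability, sorted from most to least likely."""
    results = []
    for row in membership:
        # Pair every probability with its cluster index, then rank by probability.
        ranked = sorted(enumerate(row), key=lambda pair: pair[1], reverse=True)
        results.append(ranked[:n])
    return results


# Hypothetical membership matrix: 2 images, 3 clusters (made-up values).
membership = [
    [0.70, 0.20, 0.10],
    [0.05, 0.15, 0.80],
]

top2 = top_n_clusters(membership, n=2)
for image_index, ranked in enumerate(top2):
    print(image_index, ranked)
```

The same function should also accept the NumPy array returned by hdbscan.membership_vector, since it only iterates over rows and values.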