MaartenGr / BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.
https://maartengr.github.io/BERTopic/
MIT License

How to create layers for DataMapPlot #1891

Open zilch42 opened 7 months ago

zilch42 commented 7 months ago

Hi Maarten,

I've been following the new datamapplot addition which looks awesome. One of the great things about the interactive version is the ability to use layers that change as you zoom. While I don't think the interactive version has been implemented yet, the concept of layers may also be useful for static plots that have too many topics to visualise.

I'm wondering if BERTopic's hierarchical_topics function can be used to create one or more layers at different levels of granularity to help with datamapplot? Is there a way that this could be done?

MaartenGr commented 7 months ago

Thank you for the suggestion. As more hierarchical clustering algorithms are released, I am currently looking into ways to model a hierarchy of topics (like HDBSCAN does) whilst fitting the model. As you mentioned, even if the clustering algorithm does not support this, we could still model this hierarchy with hierarchical_topics.

It should be possible and my intention is to make it possible. However, I have limited time at the moment to spend on this (and also many other issues/requests that are currently open), so it might take a while.

Having said that, it would be an awesome feature to have, and since BERTopic is already quite modular, it should be very much possible.

zilch42 commented 7 months ago

Thanks Maarten. I look forward to those developments. I initially had a look at what's going on in visualize_hierarchical_documents and couldn't make much sense of it, but if I do figure out a method that utilizes hierarchical_topics, I will add it here.
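
For anyone landing here, below is a minimal sketch of one way layers could be derived from a merge hierarchy. It is not an official BERTopic feature. It assumes the output of hierarchical_topics has been converted to a list of records with `Parent_ID`, `Topics` and `Distance` fields (for example via `to_dict("records")` on the returned frame); the helper names `cut_hierarchy` and `label_layer`, the toy merge data and the `"Unlabelled"` noise label are all hypothetical and only there for illustration. The idea is to cut the tree at several distance thresholds and treat each cut as one layer.

```python
def cut_hierarchy(merges, base_topics, max_distance):
    """Map each base topic to the largest merged group that contains it
    and whose merge distance does not exceed max_distance."""
    # Every base topic starts as its own group of size 1.
    group_of = {t: str(t) for t in base_topics}
    size_of = {t: 1 for t in base_topics}
    for merge in merges:
        if merge["Distance"] > max_distance:
            continue  # this merge lies above the cut, so ignore it
        members = merge["Topics"]
        for t in members:
            # Keep the biggest qualifying group seen so far for this topic.
            if t in group_of and len(members) > size_of[t]:
                group_of[t] = str(merge["Parent_ID"])
                size_of[t] = len(members)
    return group_of


def label_layer(doc_topics, group_of, noise_label="Unlabelled"):
    """Turn per-document topic ids into per-document labels for one layer.
    Topics missing from the mapping (e.g. the -1 outlier topic) get noise_label."""
    return [group_of.get(t, noise_label) for t in doc_topics]


# Toy merge records (assumed shape, not real model output).
merges = [
    {"Parent_ID": "4", "Topics": [0, 1], "Distance": 0.3},
    {"Parent_ID": "5", "Topics": [2, 3], "Distance": 0.6},
    {"Parent_ID": "6", "Topics": [0, 1, 2, 3], "Distance": 1.2},
]
base_topics = [0, 1, 2, 3]
doc_topics = [0, 2, 3, -1, 1]  # one topic id per document

# Two layers: a fine-grained cut and a coarser cut.
fine = label_layer(doc_topics, cut_hierarchy(merges, base_topics, 0.5))
coarse = label_layer(doc_topics, cut_hierarchy(merges, base_topics, 1.0))
```

In practice the group ids would probably be swapped for readable names (the hierarchy output also carries parent names), and the resulting per-document label lists could then be passed as separate label layers to datamapplot, which, as far as I can tell, accepts several label arrays for its interactive plot. Worth checking against the datamapplot docs before relying on it.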