Open lorsanta opened 1 year ago
Thanks for sharing this! I will have to check whether additional sub-selection will be necessary or if a combination of classes might be preferred for some users. Combined with the visualization there might be additional updates necessary.
Hi! I'm working with a multi-label dataset, and I'm trying to use the `topics_per_class` function. However, I noticed that the function only supports single labels. It would be great if the function could support multi-label datasets as well. Maybe by adding an optional argument called `problem_type`, which could be set to either `"multi-label"` or `"single-label"`, or by just checking whether the type of `classes[0]` is `list` and changing the behavior of the function based on that.
Personally, to make it work I changed the lines:
https://github.com/MaartenGr/BERTopic/blob/845d423bdef44a4a68fc0b1c9362f97237035d3c/bertopic/_bertopic.py#L769-L772
with