MaartenGr / BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.
https://maartengr.github.io/BERTopic/
MIT License
5.79k stars 721 forks

How to use Zero-Shot Classification with Open AI representation? #1548

Open imprateekagarwal opened 9 months ago

imprateekagarwal commented 9 months ago

I would like to assign a topic label from the candidate topics if a new document is similar to an already generated topic label; otherwise, I would like to assign a new label using OpenAI.

MaartenGr commented 9 months ago

You would need to update your prompt template so that it lists all candidate topics, with an instruction to either assign one of the listed topics or create a new label. Note, though, that this only works at the topic level, since calling OpenAI for individual documents is quite inefficient.
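A minimal sketch of such a prompt template, for illustration only. The candidate labels and the `build_prompt` helper are hypothetical. `[KEYWORDS]` is the placeholder mentioned in this thread. `[DOCUMENTS]` is assumed here to be the matching placeholder for a topic's representative documents, and the exact wording of the prompt is only one possible choice.

```python
# Hypothetical candidate labels; replace these with your own.
CANDIDATE_TOPICS = ["billing", "shipping", "account access"]


def build_prompt(candidate_topics):
    """Build a topic-level prompt asking for a candidate label or a new one."""
    candidates = "\n".join(f"- {topic}" for topic in candidate_topics)
    return (
        "I have a topic that contains the following documents:\n"
        "[DOCUMENTS]\n"
        "The topic is described by the following keywords: [KEYWORDS]\n\n"
        "Candidate topic labels:\n"
        f"{candidates}\n\n"
        "If one of the candidate labels fits this topic, return exactly that label. "
        "Otherwise, create a new short label.\n"
        "topic: <label>"
    )


# The resulting string is what you would hand to the OpenAI representation
# model as its prompt (assumed to be accepted via a `prompt` argument).
prompt = build_prompt(CANDIDATE_TOPICS)
```

Because the prompt is filled in once per topic rather than once per document, the number of OpenAI calls stays proportional to the number of topics.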

imprateekagarwal commented 9 months ago

But can we handle it in a chain of representation models, e.g.:

representation_models = [PartOfSpeech("en_core_web_sm"), KeyBERTInspired(), mmr, zeroshot, openai_generator]

MaartenGr commented 9 months ago

The zero-shot step passes its assigned labels to the OpenAI generator through [KEYWORDS], so you can make use of that in your prompt. Personally, I would skip the zero-shot step and let the OpenAI generator handle the assignment. That would be much more efficient, since the OpenAI generator can also do zero-shot classification.
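To make the hand-off concrete, here is a simplified illustration of placeholder substitution. It is not BERTopic's internal code: `fill_prompt`, the template text, and the sample keywords and document are all hypothetical, and `[DOCUMENTS]` is an assumed placeholder alongside the `[KEYWORDS]` one mentioned above.

```python
def fill_prompt(template, keywords, documents):
    """Substitute the placeholders the way a prompt-based generator might."""
    doc_block = "\n".join(f"- {doc}" for doc in documents)
    return (
        template
        .replace("[KEYWORDS]", ", ".join(keywords))
        .replace("[DOCUMENTS]", doc_block)
    )


template = "Documents:\n[DOCUMENTS]\nKeywords: [KEYWORDS]\nGive a short topic label."

# With a zero-shot step earlier in the chain, the keyword slot carries
# the label that step assigned (hypothetical example label).
with_zeroshot = fill_prompt(template, ["shipping"], ["My parcel arrived late."])

# Without it, the slot carries the keywords from the preceding step.
without_zeroshot = fill_prompt(
    template, ["parcel", "late", "delivery"], ["My parcel arrived late."]
)
```

In both cases the generator sees a single string, so a prompt that lists the candidate topics itself can cover the zero-shot assignment without the extra step in the chain.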