ArikReuter / TopicGPT

TopicGPT allows integrating the benefits of LLMs into topic modelling
https://lmu-seminar-llms.github.io/TopicGPT/
MIT License
67 stars 13 forks

Language #1

Open MoramiSu opened 8 months ago

MoramiSu commented 8 months ago

Hello, thank you very much for your work! I'd like to ask whether TopicGPT has any language restrictions. Can it handle non-English text?

ArikReuter commented 8 months ago

Hello, thanks for your good question about TopicGPT. In theory, the model should be as multilingual as GPT-3.5 or GPT-4, i.e. it should be able to deal with many languages natively. I have not tried this out, though.

However, you might want to adapt the prompts slightly to tell the model that it should expect input in many languages but output the final topics in a single language.
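
A minimal sketch of such a prompt adaptation, assuming the prompt is available as a plain string. The helper `add_language_instruction` and the placeholder `base_prompt` are hypothetical and not part of TopicGPT; how the adapted prompt is passed back to the library depends on its actual interface, which is not shown here.

```python
def add_language_instruction(base_prompt: str, output_language: str = "English") -> str:
    """Append a multilingual instruction to an existing prompt string.

    Hypothetical helper for illustration only; it is not part of TopicGPT.
    """
    instruction = (
        "The documents may be written in several different languages. "
        f"Regardless of the input language, write the final topics in {output_language}."
    )
    # Strip trailing whitespace so the instruction is separated by exactly one blank line.
    return f"{base_prompt.rstrip()}\n\n{instruction}"


# Placeholder prompt text, assumed for this example only.
base_prompt = "Describe the topic of the following documents."
adapted_prompt = add_language_instruction(base_prompt, output_language="English")
print(adapted_prompt)
```

Keeping the instruction as a separate suffix leaves the original prompt wording untouched, so the change is easy to revert if it hurts topic quality.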

MoramiSu commented 8 months ago

I'm wondering if clustering also supports multiple languages. If so, could you please advise on how to configure it?