MaartenGr / BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.
https://maartengr.github.io/BERTopic/
MIT License

Add LiteLLM as a representation model #2213

Open MaartenGr opened 1 week ago

MaartenGr commented 1 week ago

What does this PR do?

Adds LiteLLM as a representation model, which should open up many more LLMs for use in BERTopic.

To use this, you will need to install the litellm package first:

pip install litellm

Then, get yourself an API key of any provider (for instance OpenAI) and use it as follows:

import os
from bertopic.representation import LiteLLM
from bertopic import BERTopic

# set ENV variables
os.environ["OPENAI_API_KEY"] = "your-openai-key"

# Create your representation model
representation_model = LiteLLM(model="gpt-3.5-turbo")

# Use the representation model in BERTopic on top of the default pipeline
topic_model = BERTopic(representation_model=representation_model)
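
Since LiteLLM routes requests by the model string, other providers should work by just swapping the model name and the corresponding API key. A rough sketch (the Anthropic model name below is only an illustration; check LiteLLM's provider docs for the exact string):

import os
from bertopic import BERTopic
from bertopic.representation import LiteLLM

# Point LiteLLM at a different provider by changing the API key and model string
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

# Hypothetical model name; LiteLLM also accepts provider-prefixed strings such as "anthropic/<model>"
representation_model = LiteLLM(model="claude-3-haiku-20240307")
topic_model = BERTopic(representation_model=representation_model)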

To do (in a separate PR)


Skar0 commented 6 days ago

Hi! Just asking out of curiosity: would it make sense to use the LangChain connection here, together with the LangChain representation, instead of a separate implementation altogether? I understand it might not be the preferred approach, but I came across this and thought it was worth mentioning 🙂

MaartenGr commented 6 days ago

@Skar0 Thanks for sharing! Although that is a perfectly reasonable approach, LangChain changes its API quite often and brings a large set of dependencies to take into account. In contrast, LiteLLM is light on dependencies and its API shouldn't change much since it adheres to OpenAI's offering.

I figured that a lighter alternative is welcome in this case.
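
For context, the call that LiteLLM standardizes looks roughly like this; it follows the OpenAI chat-completions format regardless of the provider behind the model string, so the response shape is the familiar OpenAI-style one (the prompt here is just an illustration):

import litellm

# litellm.completion mirrors the OpenAI chat-completions signature,
# independent of which provider the model string resolves to
response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Give a short label for a topic about electric cars."}],
)
print(response.choices[0].message.content)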