MaartenGr / BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.
https://maartengr.github.io/BERTopic/
MIT License
5.76k stars 716 forks

Creating representations using IBM Watsonx LLMs #2001

Open andreswagner opened 1 month ago

andreswagner commented 1 month ago

Hi Maarten,

What do you think about enabling BERTopic to create representations using Watsonx-hosted LLMs as well, such as Llama-3-70b, Mixtral-8x7b, Granite, and many others to come?

Let me know your thoughts; I'm happy to help with the coding.

Best, Andres

MaartenGr commented 1 month ago

Although that is currently not implemented, it might be worthwhile to check out https://github.com/MaartenGr/BERTopic/pull/1451 since that references a broad API-based package that accesses many LLMs. Is that the type of solution that would work for you? If so, I can check whether that might be the best solution here.
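
For context, this is roughly how the existing single-vendor, API-backed representation models plug into BERTopic today; the PR above would generalize this pattern behind one API. The model name and key below are placeholders:

```python
import openai
from bertopic import BERTopic
from bertopic.representation import OpenAI

# Any LLM-backed representation model is passed to BERTopic the same way,
# via the `representation_model` argument.
client = openai.OpenAI(api_key="sk-...")  # placeholder key
representation_model = OpenAI(client, model="gpt-3.5-turbo", chat=True)
topic_model = BERTopic(representation_model=representation_model)
```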

andreswagner commented 1 month ago

I'm familiar with that package as well. While having a standardized way to use LLMs is beneficial, it does have its downsides, such as the potential for dependency issues. It's not just about accessing various models and vendors; it also involves ensuring that the prompts are optimized for each LLM, which requires small adjustments to the default prompts that create the representations. For the end user, the real value lies in using different LLMs without needing to download any additional packages or tweak any prompts to achieve quality results.

Therefore, I propose incorporating Watsonx.ai calls through a standard request library to avoid adding dependency conflicts, adjusting the default prompts for each supported LLM, and including stop sequences so they deliver good performance right out of the box for BERTopic's end users. That's why I'm proposing to start with Mixtral-8x7b and Llama-3 first.
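
For illustration, here is a minimal sketch of what such a requests-only connector could look like, written against BERTopic's documented custom-representation extension point (`BaseRepresentation.extract_topics`). The endpoint path, payload shape, and model id are assumptions modeled on watsonx.ai's REST text-generation API, not a tested contract:

```python
import requests
from bertopic.representation import BaseRepresentation

# ASSUMPTION: endpoint and payload modeled on watsonx.ai's REST
# text-generation API; check IBM's docs for the authoritative contract.
WATSONX_URL = (
    "https://us-south.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29"
)

DEFAULT_PROMPT = (
    "I have a topic described by these keywords: [KEYWORDS]\n"
    "And these representative documents:\n[DOCUMENTS]\n"
    "Give a short label for this topic:"
)


class WatsonxRepresentation(BaseRepresentation):
    """Sketch of a requests-only Watsonx representation model."""

    def __init__(self, bearer_token, project_id,
                 model_id="meta-llama/llama-3-70b-instruct",
                 prompt=DEFAULT_PROMPT, stop_sequences=None):
        self.bearer_token = bearer_token
        self.project_id = project_id
        self.model_id = model_id
        self.prompt = prompt
        self.stop_sequences = stop_sequences or ["\n"]

    def _generate(self, prompt):
        # Single plain-HTTP call; no vendor SDK required.
        response = requests.post(
            WATSONX_URL,
            headers={"Authorization": f"Bearer {self.bearer_token}"},
            json={
                "model_id": self.model_id,
                "project_id": self.project_id,
                "input": prompt,
                "parameters": {
                    "max_new_tokens": 50,
                    "stop_sequences": self.stop_sequences,
                },
            },
            timeout=60,
        )
        response.raise_for_status()
        return response.json()["results"][0]["generated_text"].strip()

    def extract_topics(self, topic_model, documents, c_tf_idf, topics):
        # `documents` is BERTopic's internal DataFrame with "Document" and
        # "Topic" columns; `topics` maps topic ids to (word, weight) pairs.
        updated_topics = {}
        for topic, words in topics.items():
            keywords = ", ".join(word for word, _ in words[:10])
            docs = documents.loc[documents.Topic == topic, "Document"].head(4)
            prompt = (self.prompt
                      .replace("[KEYWORDS]", keywords)
                      .replace("[DOCUMENTS]", "\n".join(docs)))
            # Use the generated label as the topic representation, padded
            # to the usual ten (word, weight) pairs.
            updated_topics[topic] = [(self._generate(prompt), 1)] + [("", 0)] * 9
        return updated_topics
```

It would then plug in like any other representation model, e.g. `BERTopic(representation_model=WatsonxRepresentation(token, project_id))`.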

MaartenGr commented 1 month ago

> I'm familiar with that package as well. While having a standardized way to use LLMs is beneficial, it does have its downsides, such as the potential for dependency issues. It's not just about accessing various models and vendors; it also involves ensuring that the prompts are optimized for each LLM, which requires small adjustments to the default prompts that create the representations. For the end user, the real value lies in using different LLMs without needing to download any additional packages or tweak any prompts to achieve quality results. Therefore, I propose incorporating Watsonx.ai calls through a standard request library to avoid adding dependency conflicts, adjusting the default prompts for each supported LLM, and including stop sequences so they deliver good performance right out of the box for BERTopic's end users. That's why I'm proposing to start with Mixtral-8x7b and Llama-3 first.

Although I love the idea of not requiring additional installs and prompts that are specifically tuned toward specific LLMs, there is an issue with maintenance that I am not sure how to approach.

Consider this: you create a manual implementation using a request library. That by itself, if you look at the LiteLLM implementation for example, can require a significant amount of code that has to be maintained by someone. Similarly, creating these prompts is a lot of work in itself (thank you so much for the offer to create them!), but they will need to be updated every time a new model is released, which happens almost weekly.

So if we go this route, it creates a lot of code and work that needs to be maintained, and that responsibility falls to the maintainers (just me). This also affects how other vendors are currently set up. Shouldn't we do the same for Cohere, OpenAI, and others? If the answer is yes, then we are essentially rebuilding LiteLLM, and I want the focus of the work to be on the core functionality of BERTopic.

The easiest way is of course to implement LiteLLM, which gives access to many vendors but, as you mentioned, can result in more dependency issues (although I'm not sure those dependencies actually overlap with the ones in BERTopic).

I'm not sure... What do you think?

andreswagner commented 1 month ago

Hi Maarten,

Well, the beauty of open-sourcing your work is that many people can contribute to keep improving BERTopic. I have been using it since 2021, and I'm happy to become a contributor and get more people to use it. Regarding LLMs, you're right, the space is very active right now and every week something new gets released. On the other hand, only a few LLMs are available as a service, and if you apply the 80/20 rule, you can count the main platform providers on the fingers of one hand (see the image below). Furthermore, most usage comes from a few models that you can count on the fingers of your other hand. So maybe an interesting strategy might be the following:

  1. Blockbuster: have a few connectors to the Top 5 platforms and a few tailor-made prompts for the Top 5 models, with the requirement that these connectors only use HTTP requests so they don't add more dependency conflicts.
  2. Long tail: if more connectivity is required, something like LiteLLM plus a custom prompt could solve the issue, at the risk of dependency conflicts that might need special care (see the sketch after this list).
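
For the long-tail option, a minimal sketch of what a LiteLLM-routed call could look like; the `watsonx/` provider prefix shown is an assumption, and LiteLLM's documentation is authoritative for model strings and credentials:

```python
import litellm

# Long-tail path: one generic chat-completion call, routed purely by the
# model string, so any provider LiteLLM supports works unchanged.
def label_topic(model: str, keywords: list[str], docs: list[str]) -> str:
    prompt = (
        "These documents share the keywords: " + ", ".join(keywords)
        + "\n\n" + "\n".join(docs)
        + "\n\nGive a short label for this topic:"
    )
    response = litellm.completion(
        model=model,  # e.g. "watsonx/meta-llama/llama-3-70b-instruct"
        messages=[{"role": "user", "content": prompt}],
        max_tokens=30,
    )
    return response.choices[0].message.content.strip()
```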

On the other hand, if maintaining the Watsonx connector is what's holding you back, I'm more than happy to keep helping to keep it updated in BERTopic: for me, it's much more expensive to maintain a forked version and share that with the rest of the team.

[Image: chart of the main LLM platform providers and most-used models]

MaartenGr commented 4 weeks ago

> So maybe an interesting strategy might be the following:
>
> 1. Blockbuster: have a few connectors to the Top 5 platforms and a few tailor-made prompts for the Top 5 models, with the requirement that these connectors only use HTTP requests so they don't add more dependency conflicts.
> 2. Long tail: if more connectivity is required, something like LiteLLM plus a custom prompt could solve the issue, at the risk of dependency conflicts that might need special care.

I think the Long tail option should definitely be there in order to make sure that all types of models and services are available to the user base. You would be surprised by how many different LLMs (including fine-tunes) are used with BERTopic. That said, having Blockbuster would make sure that those who are not familiar with the many thousands of models out there can still use the most common ones.

One thing to note here is that it should be really clear when something is and isn't a model-tailored prompt.

As most prompts in BERTopic are currently not tailored to specific models, it might be confusing for users that they sometimes have to customize the prompts and sometimes don't.
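
To make that distinction concrete, here is a sketch of clearly labeled, model-tailored defaults alongside a model-agnostic base prompt in the `[KEYWORDS]`/`[DOCUMENTS]` style BERTopic already uses. The chat templates and stop tokens are the published Llama-3 and Mixtral instruction formats, while the watsonx model ids are assumptions:

```python
# Model-agnostic base prompt, in the style BERTopic already uses.
BASE_PROMPT = (
    "I have a topic described by these keywords: [KEYWORDS]\n"
    "And these representative documents:\n[DOCUMENTS]\n"
    "Give a short label for this topic:"
)

# Explicitly labeled, model-tailored defaults: each wraps the base prompt
# in that model's chat template and sets its stop sequence.
MODEL_TAILORED_DEFAULTS = {
    # Llama-3 instruct chat template and its end-of-turn stop token.
    "meta-llama/llama-3-70b-instruct": {
        "prompt": (
            "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
            + BASE_PROMPT
            + "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
        ),
        "stop_sequences": ["<|eot_id|>"],
    },
    # Mixtral instruction format.
    "mistralai/mixtral-8x7b-instruct-v01": {
        "prompt": "<s>[INST] " + BASE_PROMPT + " [/INST]",
        "stop_sequences": ["</s>"],
    },
}
```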

> I have been using it since 2021, and I'm happy to become a contributor and get more people to use it.

> On the other hand, if maintaining the Watsonx connector is what's holding you back, I'm more than happy to keep helping to keep it updated in BERTopic: for me, it's much more expensive to maintain a forked version and share that with the rest of the team.

That would be great! The maintenance of any additional line of code is tricky to keep up with as an individual developer, so any help here is greatly appreciated 😄

andreswagner commented 3 weeks ago

Hi Maarten,

So what do you think about starting by including a module that calls Watsonx without adding any additional libraries to BERTopic? I already have the code and have tested it many times across different projects... and it's just a handful of lines. Are you OK with me opening a pull request for that?

Best, Andres

MaartenGr commented 3 weeks ago

Yes! A PR would definitely be helpful here, thank you. There's a Ruff PR open that might need to be integrated at some point, but since you are essentially creating a new .py file, you are unlikely to touch much existing code, so I don't expect it to be an issue.