michielree opened 1 week ago
Perhaps this support could also be extended to open-source models like Llama, Gemma, or Phi.
Hi @michielree, thanks for the interest! I agree, it would be nice to be interoperable with vLLM. Your proposal definitely aligns with the project direction.
I am not super attached to the LiteLLM integration, but I think wrapping all the LLM providers is a bigger engineering effort right now. So far, people have been using OpenAI, Ollama, and Gemini. Maybe we can detect (via the user-defined config) whether the user wants to point to a LiteLLM proxy, and use the OpenAI client for those model calls, as demonstrated in the LiteLLM docs: https://docs.litellm.ai/docs/providers/vllm
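Roughly, that detection could look like this (the `litellm_proxy` config key and its shape are purely illustrative here, not an existing DocETL option):

```python
def client_kwargs(config: dict) -> dict:
    """Build keyword arguments for openai.OpenAI() from user config.

    If the (hypothetical) `litellm_proxy` section is present, point the
    OpenAI client at the proxy, which then fans out to vLLM, Gemini, etc.
    Otherwise return no overrides, so the client falls back to its normal
    defaults (OPENAI_API_KEY / api.openai.com).
    """
    proxy = config.get("litellm_proxy")
    if not proxy:
        return {}
    return {
        "base_url": proxy["api_base"],  # e.g. "http://localhost:4000"
        # LiteLLM proxies typically accept any key unless virtual keys are configured
        "api_key": proxy.get("api_key", "dummy"),
    }

# Usage: client = openai.OpenAI(**client_kwargs(user_config))
```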
Open to other design ideas though; feel free to comment any ideas or submit a proof of concept PR. Thank you for taking this on 🙏🏽
Hi @shreyashankar, thanks for your reply. I agree that letting users optionally access models through a LiteLLM proxy would be easiest to implement now and the most flexible in the long run. It decouples per-provider API key management from DocETL, and it allows pipelines with steps that use multiple OpenAI-compatible providers (e.g. OpenAI and vLLM). Currently we would have to commit to one of those, because the `api_base` used by `litellm.completion` is defined via an environment variable. I'll work on a PR and circle back! Feel free to share any comments in the meantime.
Work in progress here for those interested: https://github.com/michielree/docetl/tree/feature/streamline-litellm-proxy
awesome thank you!! let me know if any issues come up.
First of all, many thanks for sharing this project! It's a much better execution of an idea we (University of Groningen, Center for Information Technology) had floating around internally. We currently serve models through vLLM for our research staff, and this would be a great tool for them to start designing their on-premise, LLM-powered workflows. Unfortunately, vLLM configuration options like the API base URL and the API key can't be set through environment variables, so an approach similar to the Ollama example would not work for adding vLLM as a provider.
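For illustration, here is the asymmetry I mean, as a sketch (assuming LiteLLM's documented `OLLAMA_API_BASE` variable and `hosted_vllm/` model prefix; worth double-checking against the current docs):

```python
import os

# Ollama-style: the server URL can be picked up from an environment
# variable, so no per-call configuration is needed:
os.environ["OLLAMA_API_BASE"] = "http://localhost:11434"
# litellm.completion(model="ollama/llama3")

# vLLM-style: the server URL must travel with the call itself, so it
# has to come from pipeline config rather than the environment:
vllm_call = {
    "model": "hosted_vllm/facebook/opt-125m",  # prefix per LiteLLM's vLLM docs
    "api_base": "http://vllm-host:8000/v1",
}
# litellm.completion(**vllm_call)
```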
Proposed solution
I would be happy to contribute to a more flexible way of adding LLM providers, particularly for vLLM, but also considering the future addition of other providers. Since the project already uses LiteLLM, I had a few thoughts:
Generalize the configuration options
It would be helpful to scan all the keywords used in the LiteLLM completion calls per provider (not just vLLM) and expose them as configuration options, similar to `default_model` at the pipeline level and `model` at the step level.
Use OpenAI client with configurable options
Instead of having multiple `call_llm` operations spread across the project, it might make sense to use the OpenAI client with configurable parameters for the base URL and API key. For users who wish to use a non-OpenAI provider (e.g., vLLM), we could suggest a LiteLLM proxy setup, allowing easier switching between providers without code modifications.
Next steps
I'd be happy to discuss further and adjust based on any feedback. Thanks again for the great work on this project!
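To make the generalization concrete, here is a sketch of per-step keyword merging over pipeline-level defaults (every key name below is illustrative, not an existing DocETL schema):

```python
# Illustrative pipeline config (keys are hypothetical):
pipeline_config = {
    "default_model": "gpt-4o-mini",
    "default_api_base": None,  # None -> fall back to the provider default
    "steps": [
        {"name": "extract",
         "model": "hosted_vllm/meta-llama/Llama-3.1-8B-Instruct",
         "api_base": "http://vllm-host:8000/v1",
         "api_key": "token-abc123"},
        {"name": "summarize"},  # inherits the pipeline-level defaults
    ],
}

def completion_kwargs(step: dict, cfg: dict) -> dict:
    """Merge step-level LiteLLM keywords over pipeline-wide defaults."""
    kwargs = {"model": step.get("model", cfg["default_model"])}
    for key in ("api_base", "api_key"):  # extendable to other LiteLLM kwargs
        value = step.get(key, cfg.get(f"default_{key}"))
        if value is not None:
            kwargs[key] = value
    return kwargs
```

Scanning LiteLLM's per-provider signatures would then tell us which keywords beyond `api_base`/`api_key` the merge loop should cover.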
Best regards,
Michiel van der Ree
University of Groningen, Center for Information Technology