xingyaoww opened 1 month ago
OpenHands started fixing the issue! You can monitor the progress here.
@xingyaoww - also see #4109 where litellm's Router is being incorporated and also a config structure that could maybe used here
An attempt was made to automatically fix this issue, but it was unsuccessful. A branch named 'openhands-fix-issue-4184' has been created with the attempted changes. You can view the branch here. Manual intervention may be required.
Quick point of discussion: do we want to implement this within OpenHands? Or should we host a server with the router, like we host our proxy server for All Hands AI?
Personally I think the latter might be better. Doing this on the client side means that users have to acquire several different API keys and somehow configure them. This seems like a pain UI-wise, especially given that currently our configuration behavior is hard to understand: https://github.com/All-Hands-AI/OpenHands/issues/3220
Good point - but another thing is it might be tricky to calculate costs (especially with all the prompt caching and stuff) for the router then :(
Another potential idea is to do this with LiteLLM router 🤔 https://docs.litellm.ai/docs/routing#advanced---routing-strategies-%EF%B8%8F
Yeah, maybe NotDiamond could be implemented as a custom routing strategy within the LiteLLM proxy?
Yeah, seems like a better approach (if we can get cost propagation to work correctly). Closing this for now, then.
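To make the custom-routing idea above concrete, here is a minimal, self-contained sketch of the kind of decision such a strategy would make. The function and field names are hypothetical illustrations, not the actual LiteLLM or NotDiamond APIs; a real integration would call NotDiamond's `model_select` where the heuristic sits.

```python
# Illustrative sketch only: the shape of a custom routing decision that a
# LiteLLM-proxy routing strategy could make. All names here are hypothetical.

def pick_deployment(messages, deployments):
    """Choose a model deployment for an OpenAI-style message list.

    A real integration would call NotDiamond's model_select endpoint here;
    this stand-in routes long or code-heavy prompts to a stronger (pricier)
    model and everything else to a cheaper one.
    """
    text = " ".join(m.get("content", "") for m in messages)
    looks_hard = len(text) > 500 or "```" in text
    tier = "strong" if looks_hard else "cheap"
    return next(d for d in deployments if d["tier"] == tier)

deployments = [
    {"model": "gpt-4o", "tier": "strong"},
    {"model": "gpt-4o-mini", "tier": "cheap"},
]
choice = pick_deployment([{"role": "user", "content": "hi"}], deployments)
```

The cost-propagation concern above would then live wherever the chosen deployment's completion call is made, since each candidate model has its own pricing.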
Hi @xingyaoww @neubig, just caught this issue.
While our LLMConfigs accept prices, they only help tune cost tradeoffs. You won't have to provide that parameter for public models - we track prices for every model we support.
Beyond this, we're also happy to help you set up a routing integration with Not Diamond's API. Just let me know if that interests you.
As for LiteLLM, we've actually been discussing an integration with them since July! While waiting on their feedback, we've also implemented a simple integration in our Python client which might help you.
Thanks @acompa , I do think we'd be interested in at least running an evaluation where we use NotDiamond as a backend and see if the results are better/cheaper than what we get now. If your API offers OpenAI compatible endpoints it should be pretty easy (we haven't looked super-carefully yet).
We do accept OpenAI-style requests with `messages` at our `model_select` endpoint. We're not a proxy, though, so at the moment we only support `create` via a Langchain integration.
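For reference, a request to a `model_select`-style endpoint might look like the sketch below. The URL path, `llm_providers` field name, and model identifiers are assumptions for illustration; check NotDiamond's documentation for the real schema.

```python
import json

# Hypothetical request body for a model_select-style endpoint that accepts
# OpenAI-style messages. Field names are assumptions, not NotDiamond's schema.
payload = {
    "messages": [
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Fix this failing test."},
    ],
    # Candidate models the router may recommend (illustrative names).
    "llm_providers": [
        {"provider": "openai", "model": "gpt-4o"},
        {"provider": "anthropic", "model": "claude-3-5-sonnet-20240620"},
    ],
}
body = json.dumps(payload)
# The response would name the recommended provider/model; the caller then
# issues its normal completion request (e.g. through LiteLLM) to that model.
```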
Cool, thanks! I'll re-open this as I think that whatever way we implement it'd be interesting to see if model routing helps.
Excellent. As you begin your evaluation, note that we offer two approaches to AI model routing:
1. Our out-of-the-box router has been trained on generalist, cross-domain data (including coding and non-coding tasks) to provide a strong "multi-model" multidisciplinary experience.
2. OpenHands focuses on development applications, so you might benefit from specialized routing trained on the distribution of your proprietary data. We also offer custom routing to serve these domain-targeted use cases as a higher-performance option beyond out-of-the-box routing.
We're happy to answer questions or support you in whichever of these approaches you evaluate.
@neubig we could also look into the https://github.com/Not-Diamond/RoRF/ repo (pairwise routing) to start with?
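Pairwise routing in the spirit of RoRF means a binary router that scores each query and picks one of exactly two models. RoRF itself trains a random-forest classifier over query embeddings; the length-based scoring rule below is a made-up stand-in for that classifier, and the model names are placeholders.

```python
# Toy illustration of pairwise routing: score a query, pick one of two models.
# RoRF uses a trained random-forest classifier; this heuristic is a stand-in.

def pairwise_route(query: str,
                   strong: str = "claude-3-5-sonnet",
                   weak: str = "gpt-4o-mini",
                   threshold: float = 0.5) -> str:
    # Stand-in "win probability" that the strong model is needed:
    # here, a crude length-based heuristic in place of a real classifier.
    p_strong = min(len(query) / 1000, 1.0)
    return strong if p_strong >= threshold else weak
```

A real evaluation would swap the heuristic for RoRF's trained classifier and compare accuracy/cost against always calling the strong model.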
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
I think the NotDiamond folks are working on this still.
What problem or use case are you trying to solve?
Not Diamond intelligently identifies which LLM is best-suited to respond to any given query. We want to implement a mechanism in OpenHands to support this type of "LLM" selector.
Describe the UX of the solution you'd like
Ideally, users should define an "LLMRouter" as a special type of LLM with some special configs (e.g., multiple keys for different providers). A user can then just put in their keys and select that router, and OpenHands will automatically use it going forward.
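As a sketch of what that UX could look like in OpenHands' `config.toml` conventions (the section and key names below are hypothetical, not an existing schema):

```toml
# Hypothetical "LLMRouter" config sketch; names are illustrative only.
[llm.router]
model = "notdiamond/router"
api_key = "nd-..."

# Candidate models the router may choose between, each with its own key.
[llm.router.candidates.gpt-4o]
api_key = "sk-..."

[llm.router.candidates.claude-3-5-sonnet]
api_key = "sk-ant-..."
```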
Do you have thoughts on the technical implementation?
Modify https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/llm/llm.py, as well as config related files under https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/core/config.
You should probably use `model_select` (from the NotDiamond API) rather than `create`, to be compatible with existing LiteLLM calls.
Describe alternatives you've considered
Additional context
Here's the documentation from NotDiamond