lm-sys / RouteLLM

A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!

GPT-4o not found in MODEL_IDS for mf router #7

Closed sapountzis closed 4 months ago

sapountzis commented 4 months ago
  File "/home/andreas/PycharmProjects/RouteLLM/.venv/lib/python3.10/site-packages/starlette/routing.py", line 732, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/usr/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/home/andreas/PycharmProjects/RouteLLM/routellm/openai_server.py", line 40, in lifespan
    ROUTERS_MAP[router] = ROUTER_CLS[router](**router_config)
  File "/home/andreas/PycharmProjects/RouteLLM/routellm/routers/routers.py", line 231, in __init__
    self.strong_model_id = MODEL_IDS[strong_model]
KeyError: 'gpt-4o'

I see there is a predefined list of LLMs in the MODEL_IDS dictionary.

Is there a way to specify arbitrary models?

For example, in my use case I want to use gpt-4o as the strong model and Qwen/Qwen2-72B-Instruct (served via Together.ai) as the weak model.

Is there a methodology to generate matrix factorization data for any model pair?

KTibow commented 4 months ago

The data is based on lmsys's 55k dataset, which is a bit old. You would need updated data in order to get a model embedding for gpt-4o.
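
To illustrate what "a model embedding" means here: in a matrix-factorization router, every model seen in the training battles gets a learned embedding vector, and the router scores a prompt against that vector. Below is a rough sketch of the idea only, not RouteLLM's actual implementation; the dictionary entries, dimensions, and scoring form are all illustrative assumptions.

```python
import torch

# Illustrative only: models present in the training data each get an index
# and a learned embedding row; a model missing from this table (e.g. gpt-4o
# in the shipped checkpoint) has no embedding, hence the KeyError above.
MODEL_IDS = {"gpt-4-1106-preview": 0, "mixtral-8x7b-instruct-v0.1": 1}  # hypothetical entries

model_embeddings = torch.nn.Embedding(num_embeddings=len(MODEL_IDS), embedding_dim=128)
prompt_projection = torch.nn.Linear(1536, 128)  # assumed prompt-embedding size

def score(prompt_embedding: torch.Tensor, model_name: str) -> torch.Tensor:
    """Bilinear-style score of how well `model_name` is expected to do on this prompt."""
    model_vec = model_embeddings(torch.tensor(MODEL_IDS[model_name]))
    return (prompt_projection(prompt_embedding) * model_vec).sum()
```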

iojw commented 4 months ago

@KTibow is correct that we trained the current matrix factorization router on older data, so it doesn't contain gpt-4o.

However, we have also found that the performance of our routers transfers to newer strong / weak model pairs (https://lmsys.org/blog/2024-07-01-routellm/#generalizing-to-other-models), so you could use gpt-4o and Qwen as the pair used for routing via the --strong-model and --weak-model flags when serving, and this should give comparable performance.

There is a distinction between the model pair used for training routers (what is specified in the config for matrix factorization) and the model pair used for routing (specified using the flags).
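
To make that concrete, here is a rough sketch of what this could look like for gpt-4o plus Qwen via Together.ai using the Python API. The Controller class and its keyword arguments, the together_ai/ model prefix, and the API-key environment variables are assumptions here rather than anything confirmed in this thread; when serving, the equivalent is passing --strong-model and --weak-model.

```python
# Sketch: route between a newer model pair while keeping the shipped MF checkpoint.
import os
from routellm.controller import Controller

os.environ["OPENAI_API_KEY"] = "sk-..."    # for gpt-4o
os.environ["TOGETHERAI_API_KEY"] = "..."   # assumed env var name for Together.ai

client = Controller(
    routers=["mf"],                                     # router trained on the older pair
    strong_model="gpt-4o",                              # routing pair: strong side
    weak_model="together_ai/Qwen/Qwen2-72B-Instruct",   # routing pair: weak side
)

response = client.chat.completions.create(
    model="router-mf-0.11593",   # threshold value is illustrative
    messages=[{"role": "user", "content": "Hello!"}],
)
```

The matrix factorization config stays as-is (the trained checkpoint and its original model pair); only the two endpoints actually being routed between change.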

Let me know if this makes sense! Happy to answer any questions.

sapountzis commented 4 months ago

Thanks for the clarification!