The data is based on lmsys's 55k dataset, which is a bit old. You would need updated data to get a model embedding for it.
@KTibow is correct that we trained the current matrix factorization router on older data, so it doesn't contain gpt-4o.
However, we have also found that the performance of our routers transfers to newer strong / weak model pairs (https://lmsys.org/blog/2024-07-01-routellm/#generalizing-to-other-models), so you could use gpt-4o and Qwen as the pair used for routing via the `--strong-model` and `--weak-model` flags when serving, and this should have comparable performance.
Note that there is a distinction between the model pair used for training a router (specified in the config for matrix factorization) and the model pair used for routing (specified using the flags above).
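For concreteness, here is a minimal sketch of routing with that pair from Python. This assumes the `Controller` API shown in the RouteLLM README and litellm's `together_ai/` provider prefix and `TOGETHERAI_API_KEY` convention; the threshold embedded in the model name is illustrative, so tune it for your cost/quality target:

```python
import os

from routellm.controller import Controller

# API keys for the two providers (litellm-style environment variables).
os.environ["OPENAI_API_KEY"] = "sk-..."    # strong model: gpt-4o
os.environ["TOGETHERAI_API_KEY"] = "..."   # weak model served via Together.ai

# The "mf" router checkpoint was trained on an older model pair, but the
# routing pair below is set independently and the learned routing transfers.
client = Controller(
    routers=["mf"],
    strong_model="gpt-4o",
    weak_model="together_ai/Qwen/Qwen2-72B-Instruct",
)

# The threshold in the model name (0.11593 here is illustrative) controls
# what fraction of queries is routed to the strong model.
response = client.chat.completions.create(
    model="router-mf-0.11593",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Serving equivalent (same pair via the CLI flags; check the repo README
# for the exact invocation):
#   python -m routellm.openai_server --routers mf \
#       --strong-model gpt-4o \
#       --weak-model together_ai/Qwen/Qwen2-72B-Instruct
```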
Let me know if this makes sense! Happy to answer any questions.
Thanks for the clarification!
I see there is a predefined list of LLMs in the `MODEL_IDS` dictionary.
Is there a way to specify arbitrary models? For example, in my use case I want to use gpt-4o as the strong model and Qwen/Qwen2-72B-Instruct with Together.ai as the inference provider.
Is there a methodology to generate matrix factorization data for any model pair?