Wondering how the D_golden training data is constructed.

lm-sys / RouteLLM

A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!

Apache License 2.0

2.78k stars 204 forks

Wondering how the D_golden training data is constructed. #10

Closed URRealHero closed 1 month ago

URRealHero commented 1 month ago

Congrats on such a great innovation in agents. I'm working on reproducing the paper, possibly with more data, but I've run into a problem. As I understand the paper, $D_{\text{golden}}$ labels which model gives the right response. However, when both models produce the same result, whether both are right or both are wrong, what is the winner_model? Should I tag these cases as a tie, or should I make $M_{\text{weak}}$ the winner? How did you tag such cases in the augmented data? I'd be grateful for your answer!
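To make the question concrete, here is a minimal sketch of the two tie-handling options being asked about. This is not the labeling procedure from the paper or the RouteLLM codebase; `TiePolicy`, `label_winner`, and the `"strong"` / `"weak"` / `"tie"` label strings are hypothetical names chosen only for illustration.

```python
from enum import Enum


class TiePolicy(Enum):
    """Hypothetical options for labeling prompts where both models agree."""
    TIE = "tie"              # keep an explicit "tie" label
    WEAK_WINS = "weak_wins"  # break ties in favor of the weak model


def label_winner(strong_correct: bool, weak_correct: bool,
                 policy: TiePolicy = TiePolicy.TIE) -> str:
    """Return an illustrative winner_model label for a single prompt."""
    if strong_correct != weak_correct:
        # Exactly one model is right, so the winner is unambiguous.
        return "strong" if strong_correct else "weak"
    # Both right or both wrong: the case this issue asks about.
    return "tie" if policy is TiePolicy.TIE else "weak"


if __name__ == "__main__":
    for strong_ok, weak_ok in [(True, False), (False, True), (True, True), (False, False)]:
        print(strong_ok, weak_ok,
              label_winner(strong_ok, weak_ok, TiePolicy.TIE),
              label_winner(strong_ok, weak_ok, TiePolicy.WEAK_WINS))
```

The two policies differ only on the agreeing cases: `TIE` keeps them as a separate class, while `WEAK_WINS` folds them into the weak model's wins.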

iojw commented 1 month ago

Hi there! Please check out this blog post from our friends at Anyscale that goes into more detail about how we created the golden-labeled dataset: https://www.anyscale.com/blog/building-an-llm-router-for-high-quality-and-cost-effective-responses.

Happy to answer any other questions!

URRealHero commented 1 month ago

Thanks a lot!