URRealHero closed this issue 1 month ago
Hi there! Please check out this blog post from our friends at Anyscale that goes into more detail about how we created the golden-labeled dataset: https://www.anyscale.com/blog/building-an-llm-router-for-high-quality-and-cost-effective-responses.
Happy to answer any other questions!
Thanks a lot!
Congrats on such a great innovation in agents. I'm working on reproducing the paper, possibly with more data, but I've run into a problem. As I understand the paper, $D_{golden}$ records which model gives the right response. However, what is the winner_model when both models produce the same result, whether both are right or both are wrong? Should I tag those cases as a tie, or should I make $M_{weak}$ the winner? How did you tag them in the augmented data? I'd be grateful for your answer!
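To make the two options concrete, here is a minimal sketch of the labeling rules I am considering. The function name `label_winner`, the boolean correctness flags, the `tie_policy` parameter, and the label strings are all my own illustrative assumptions; none of them come from the paper or its code, and I am not claiming this is how the golden labels were actually produced.

```python
def label_winner(strong_correct: bool, weak_correct: bool,
                 tie_policy: str = "tie") -> str:
    """Return a hypothetical winner label for one prompt.

    strong_correct / weak_correct: whether each model's answer matched
    the golden answer (assumed to be known from the benchmark).
    tie_policy: "tie" tags equal outcomes as a tie; "weak" awards equal
    outcomes to the weak model. Both policies are assumptions.
    """
    # Exactly one model is right: that model is the winner.
    if strong_correct != weak_correct:
        return "strong" if strong_correct else "weak"

    # Both right or both wrong: the case this question is about.
    if tie_policy == "tie":
        return "tie"
    if tie_policy == "weak":
        return "weak"
    raise ValueError(f"unknown tie_policy: {tie_policy!r}")
```

With `tie_policy="weak"`, every prompt where the two models agree in correctness is credited to $M_{weak}$, which is the second option above; with `tie_policy="tie"`, those prompts get a separate label.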