Guidance needed on how to implement a basic router and evaluate it

I have a basic router that uses GPT-3.5 to classify the category of each question/prompt (e.g. code, math, reasoning). I also have a pool of 5-6 models for which I have benchmark scores on the dataset associated with each category. For each category, I sort the models using their performance score on the associated benchmark dataset and their cost per million tokens, taking the average of cost and performance.
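To make the selection rule concrete, here is a minimal sketch of the ranking step. It assumes the category has already been produced by the GPT-3.5 classifier, and it assumes "average of cost and performance" means: min-max normalize both within the category, invert cost so that cheaper is better, and average the two. The model names and numbers are placeholders, not real benchmark results.

```python
from typing import Dict, List

# Placeholder numbers for illustration only (not real benchmark results).
# performance: benchmark score in [0, 1]; cost: USD per million tokens.
POOL: Dict[str, Dict[str, Dict[str, float]]] = {
    "code": {
        "model_a": {"performance": 0.80, "cost": 10.0},
        "model_b": {"performance": 0.60, "cost": 1.0},
        "model_c": {"performance": 0.70, "cost": 5.0},
    },
}


def min_max(values: Dict[str, float]) -> Dict[str, float]:
    """Scale values to [0, 1]; if all values are equal, map them to 0.0."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in values.items()}


def rank_models(category: str) -> List[str]:
    """Return the models for a category, best combined score first."""
    stats = POOL[category]
    perf = min_max({m: s["performance"] for m, s in stats.items()})
    cost = min_max({m: s["cost"] for m, s in stats.items()})
    # Assumption: equal weighting, with cost inverted so cheaper scores higher.
    score = {m: (perf[m] + (1.0 - cost[m])) / 2.0 for m in stats}
    return sorted(score, key=score.get, reverse=True)


def route(category: str) -> str:
    """Pick the top-ranked model for the given category."""
    return rank_models(category)[0]
```

With these placeholder numbers the mid-priced, mid-quality model wins, because the most accurate model is also the most expensive and the cheapest is the least accurate.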

How do I implement this in RouterBench and compare it with the zero router and the oracle router? I have read the paper and have a good high-level understanding, but I need some guidance.
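For reference, the kind of comparison I have in mind looks roughly like the sketch below. It does not use the RouterBench API; it only assumes a table of per-prompt quality and cost for every model, and it assumes the oracle picks the best-performing model for each prompt, breaking ties by lowest cost (my reading of the paper). The zero router baseline is not covered here. All names and values are made up for illustration.

```python
from typing import Dict, Tuple

# Toy per-prompt outcomes: {prompt_id: {model: (quality, cost)}}.
# Made-up values for illustration only.
RESULTS: Dict[str, Dict[str, Tuple[float, float]]] = {
    "p1": {"model_a": (1.0, 0.010), "model_b": (1.0, 0.001)},
    "p2": {"model_a": (1.0, 0.012), "model_b": (0.0, 0.001)},
    "p3": {"model_a": (0.0, 0.008), "model_b": (0.0, 0.001)},
}


def evaluate(choices: Dict[str, str]) -> Tuple[float, float]:
    """Return (mean quality, mean cost) for a prompt -> model assignment."""
    n = len(choices)
    quality = sum(RESULTS[p][m][0] for p, m in choices.items()) / n
    cost = sum(RESULTS[p][m][1] for p, m in choices.items()) / n
    return quality, cost


def oracle_choices() -> Dict[str, str]:
    """Per prompt, pick the highest-quality model; ties go to the cheapest."""
    return {
        p: max(models, key=lambda m: (models[m][0], -models[m][1]))
        for p, models in RESULTS.items()
    }


# A router's decisions are just another prompt -> model mapping,
# so it can be scored with the same evaluate() call.
always_a = {p: "model_a" for p in RESULTS}
print("always model_a:", evaluate(always_a))
print("oracle:        ", evaluate(oracle_choices()))
```

In this toy table the oracle matches the quality of always using `model_a` at lower average cost, which is the gap I would like to measure for my own router.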

I understand that we need to implement something like the abstract router, but this simple router has no model or weights. Is that a problem?
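To illustrate what I mean by "no model or weights", here is a sketch of the router as a plain lookup behind an abstract interface. `BaseRouter` is a hypothetical stand-in that I define here, not RouterBench's actual abstract class, and the keyword classifier is only a placeholder for the GPT-3.5 call.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class BaseRouter(ABC):
    """Hypothetical interface; NOT the actual RouterBench abstract class."""

    @abstractmethod
    def route(self, prompt: str) -> str:
        """Return the name of the model that should answer the prompt."""


class CategoryTableRouter(BaseRouter):
    """Rule-based router: classify the prompt, then look up a fixed table.

    There is nothing to train, so the router's whole state is the table.
    """

    def __init__(
        self,
        classify: Callable[[str], str],
        best_model: Dict[str, str],
        default_model: str,
    ) -> None:
        self.classify = classify
        self.best_model = best_model
        self.default_model = default_model

    def route(self, prompt: str) -> str:
        category = self.classify(prompt)
        # Fall back to the default when the category is not in the table.
        return self.best_model.get(category, self.default_model)


def keyword_classifier(prompt: str) -> str:
    """Placeholder for the GPT-3.5 category call (assumption)."""
    text = prompt.lower()
    if "python" in text or "function" in text:
        return "code"
    if "integral" in text or "solve" in text:
        return "math"
    return "other"


router = CategoryTableRouter(
    classify=keyword_classifier,
    best_model={"code": "model_b", "math": "model_a"},  # placeholder table
    default_model="model_c",
)
print(router.route("Write a Python function that reverses a list"))
```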

Any guidance/pointers would help.
