Using MT-Bench to evaluate zephyr

abgoswam commented 5 months ago

The README.md here says:

Make sure the word zephyr exists in the --model-path argument when generating the model responses...

We should also ensure the word zephyr exists in the --model-id argument.

This is because:

In the MT-Bench code, they seem to pass model_id around (linked code).
They look for the word "zephyr" in it to find the matching adapter (linked here).

This is probably a bug in FastChat.

Nonetheless, we should update the README.md here too. Otherwise, people using the alignment-handbook will see low scores on MT-Bench.
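
To illustrate the failure mode described above, here is a minimal sketch of name-based adapter matching. It is not FastChat's actual code: `ADAPTERS`, `DEFAULT_TEMPLATE`, and `pick_template` are hypothetical names, and the case-insensitive substring check is an assumption. It only shows why an identifier without "zephyr" in it would silently fall back to a different chat template.

```python
# Hypothetical, simplified stand-in for name-based adapter matching.
# None of these names come from FastChat; they are for illustration only.

DEFAULT_TEMPLATE = "default"

# Each entry: (keyword searched for in the identifier, template to use).
ADAPTERS = [
    ("zephyr", "zephyr"),
]


def pick_template(model_id: str) -> str:
    """Return the template of the first adapter whose keyword is in model_id."""
    lowered = model_id.lower()  # assumption: matching ignores case
    for keyword, template in ADAPTERS:
        if keyword in lowered:
            return template
    # No keyword matched: fall back silently, which is how a mismatched
    # prompt format could go unnoticed until the scores come out low.
    return DEFAULT_TEMPLATE


if __name__ == "__main__":
    for model_id in ("zephyr-7b-sft", "my-sft-run"):
        print(model_id, "->", pick_template(model_id))
```

Under these assumptions, an identifier such as `my-sft-run` would get the fallback template even if the model path itself contained "zephyr", since only the identifier is inspected.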

fmguler commented 4 months ago

Yeah, in my experience too, using the word 'zephyr' in model-id instead of model-path works. In fact, there is no need to use the word 'zephyr' in model-path at all.

huggingface / alignment-handbook

Using MT-Bench to evaluate zephyr #114