huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0
4.2k stars 357 forks

Using MT-Bench to evaluate zephyr #114

Open abgoswam opened 5 months ago

abgoswam commented 5 months ago

In the README.md here, it says:

We should also ensure the word zephyr exists in the --model-id argument

This is because:

  1. In the MT-Bench code, they seem to pass model_id around. code
  2. They look for the word "zephyr" to find the matching adapter. here

This is probably a bug in FastChat.

Nonetheless, we should update the README.md here too. Otherwise, people using the alignment-handbook will see low scores on MT-Bench.
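
To make the behaviour described above concrete, here is a minimal sketch of name-based template selection. It is not FastChat's actual code: `select_template`, `KEYWORD_TO_TEMPLATE`, and the template names are hypothetical and only mirror the two points listed above, namely that the lookup is driven by the model id and matches on the substring "zephyr".

```python
# Hypothetical illustration only; these names do not come from FastChat.
KEYWORD_TO_TEMPLATE = {
    "zephyr": "zephyr",  # keyword searched for -> chat template name
}
DEFAULT_TEMPLATE = "default"


def select_template(model_path: str, model_id: str) -> str:
    """Pick a chat template by substring match on model_id.

    model_path is accepted but deliberately ignored, mirroring the
    behaviour reported in this issue (an assumption, not verified here).
    """
    lowered = model_id.lower()
    for keyword, template in KEYWORD_TO_TEMPLATE.items():
        if keyword in lowered:
            return template
    # No keyword matched: fall back to a generic template, which is
    # the situation that would produce low scores for a Zephyr model.
    return DEFAULT_TEMPLATE


if __name__ == "__main__":
    # The id contains "zephyr", so the Zephyr template is chosen.
    print(select_template("data/my-run", "zephyr-7b-sft-full"))
    # "zephyr" appears only in the path, so the generic template is used.
    print(select_template("data/zephyr-7b-sft-full", "my-run"))
```

Under these assumptions, a model whose id lacks the word would be prompted with the wrong template even if its path contains it.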

abgoswam commented 5 months ago

Related to: https://github.com/lm-sys/FastChat/issues/3026

fmguler commented 4 months ago

Yeah, in my experience too, using the word 'zephyr' in model-id instead of model-path works. In fact, there is no need to use the word 'zephyr' in model-path at all.