M-RewardBench: Evaluating Reward Models in Multilingual Settings

This repository contains the source code for M-RewardBench, a benchmark and toolkit for evaluating reward models in multilingual settings. We translated RewardBench into 23 diverse languages and evaluated several open-source and multilingual LLMs on their chat, safety, and reasoning capabilities. This project was part of Cohere for AI's Expedition Aya 2024, a 6-week open build challenge.


News

[2024-10-28] We've published our research, M-RewardBench: Evaluating Reward Models in Multilingual Settings, as an arXiv preprint!
[2024-10-20] Added a Translation sub-category to evaluate RM preferences on translation tasks (de<->en, zh<->en). We also improved the translation quality of the benchmark by using the Google Translate API and performing manual filtering and verification.
[2024-08-28] We won the Silver Prize in Expedition Aya 2024! We're also releasing v1 of the multilingual RewardBench on HuggingFace.

Setup and installation

We recommend installing the dependencies inside a virtual environment:

# Create and activate the virtual environment
python -m venv venv
source venv/bin/activate
# Install the dependencies (within venv context)
pip install -r requirements.txt

Note that the rewardbench package requires Python 3.10 or above.
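If you are unsure which interpreter your virtual environment uses, a quick check such as the sketch below can confirm it before you install anything. The helper name `python_is_supported` and the constant `MIN_PYTHON` are made up for this illustration and are not part of this repository or of rewardbench.

```python
import sys

# Minimum version taken from the note above; the names here are illustrative only.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {current} is new enough.")
    else:
        print(f"Python {current} is too old; 3.10 or above is required.")
```

Run it with the interpreter inside the activated venv, since that is the one pip will install into.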

Testing and Development

This codebase contains only minimal tests; they mostly cover functions that were added or patched from RewardBench. First, install the development dependencies:

pip install -r requirements-dev.txt

Then run the tests with:

pytest tests/ -v --capture=no
pytest tests/ -m "not api" -v --capture=no  # to ignore tests that make use of third-party APIs
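The second command relies on pytest markers: tests tagged with the `api` marker are deselected by `-m "not api"`. The sketch below shows roughly what such a test file could look like. It assumes the marker is applied with the standard `@pytest.mark.api` decorator; the helper and test names are hypothetical and do not come from this repository.

```python
import pytest


def normalize_language_code(code: str) -> str:
    """Hypothetical helper, defined here only so the example has something to test."""
    return code.strip().lower().replace("_", "-")


def test_normalize_language_code():
    # Unmarked test: collected and run by both commands above.
    assert normalize_language_code(" DE_de ") == "de-de"


@pytest.mark.api
def test_third_party_translation_call():
    # Marked test: deselected when pytest is invoked with -m "not api".
    # A real test would call the third-party service here; this placeholder skips instead.
    pytest.skip("placeholder for a test that calls a third-party API")
```

When adding a test that needs network access or API credentials, tagging it this way keeps the offline test run fast and reproducible.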

When developing, we format the code using black and isort, to be consistent with the RewardBench codebase. You can automatically format your code by running:

make style

Team Members

The team is composed of Srishti Gureja (@srishti-git1110), Shayekh Bin Islam (@ShayekhBinIslam), Rishabh Maheshwary (@RishabhMaheshwary), Drishti Sharma (@DrishtiShrrrma), Gusti Winata (@sanggusti), and Lj Miranda (@ljvmiranda921).

Citation

@article{gureja2024m,
  title={M-RewardBench: Evaluating Reward Models in Multilingual Settings},
  author={Gureja, Srishti and Miranda, Lester James V and Islam, Shayekh Bin and Maheshwary, Rishabh and Sharma, Drishti and Winata, Gusti and Lambert, Nathan and Ruder, Sebastian and Hooker, Sara and Fadaee, Marzieh},
  journal={arXiv preprint arXiv:2410.15522},
  year={2024}
}
