Reproduce the accuracy of chavinlo/alpaca-native on MMLU

declare-lab / instruct-eval

This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.

https://declare-lab.github.io/instruct-eval/

Open sglucas opened 1 year ago

sglucas commented 1 year ago

I am trying to evaluate the accuracy of chavinlo/alpaca-native on MMLU.

The final accuracy I get is about 36, so I cannot reproduce the result of about 41.6.

May I ask which parts I should focus on, such as the setup or the environment?

Best,
Lucas