declare-lab / instruct-eval

This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
https://declare-lab.github.io/instruct-eval/
Apache License 2.0

Reproduce the accuracy of chavinlo/alpaca-native on MMLU #25

Open sglucas opened 1 year ago

sglucas commented 1 year ago

Hi

I am trying to evaluate the accuracy of chavinlo/alpaca-native on MMLU.

The final accuracy I get is about 36, so I cannot reproduce the reported result of about 41.6.
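For reference, I believe I am running the standard command from the repo README (please correct me if this is not the intended invocation):

    python main.py mmlu --model_name llama --model_path chavinlo/alpaca-native
    # README reports roughly 41.6 for this setup; I get about 36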

May I ask which parts I should check: the setup, the environment, or something else?

Best,
Lucas