gururise / AlpacaDataCleaned

Alpaca dataset from Stanford, cleaned and curated
Apache License 2.0
1.5k stars 149 forks source link

The MNLI score in lm-evaluation-harness #61

Open wang99711123 opened 1 year ago

wang99711123 commented 1 year ago

Thanks for the great work!

I'm trying to reproduce the results you report. I downloaded the model weights from link https://huggingface.co/yahma/alpaca-7b-lora and evaluated them under the framework of lm-evaluation-harness. But I only got 41.7% accuracy on MNLI dataset.

When using lm-evaluation-harness, did you perform other data processing tricks to get 51.6% acc?