I'm trying to reproduce the results you report. I downloaded the model weights from https://huggingface.co/yahma/alpaca-7b-lora and evaluated them with lm-evaluation-harness, but I only got 41.7% accuracy on the MNLI dataset.
When running lm-evaluation-harness, did you apply any additional data-processing tricks to reach the reported 51.6% accuracy?
Thanks for the great work!