princeton-nlp / LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
https://arxiv.org/abs/2310.06694
MIT License

Metric Scores and NQ Evaluation #41

Closed Spico197 closed 8 months ago

Spico197 commented 8 months ago

Hi there, thank you very much for such amazing work! I'm currently trying to reproduce the results from the paper, but I have the following problems:

Thank you so much for your time and response~

xiamengzhou commented 8 months ago

Hi! Sorry for the late reply:

We have updated the script for evaluating nq, boolq, and gsm8k; it is at run_eval.sh. We use the nq_open task from the harness repo.

We have also updated the README.md file to list the metrics we use.

Hope it helps and feel free to reach out again for any further questions!
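
For readers comparing their own numbers against reported ones, open-domain QA tasks such as nq_open are commonly scored with a normalized exact-match metric. The following is a minimal sketch of that idea, not the harness's implementation: the function names (`normalize_answer`, `exact_match`, `corpus_exact_match`) and the specific normalization steps are assumptions for illustration, and the actual nq_open task may normalize differently.

```python
import re
import string


def normalize_answer(text):
    """Lowercase, drop ASCII punctuation and English articles, collapse whitespace.

    These normalization steps are an assumption for illustration only.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, references):
    """Return 1.0 if the normalized prediction equals any normalized reference."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(ref) for ref in references))


def corpus_exact_match(predictions, references_per_example):
    """Average the per-example exact-match score over a dataset."""
    if not predictions:
        return 0.0
    scores = [
        exact_match(pred, refs)
        for pred, refs in zip(predictions, references_per_example)
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical predictions and gold answers, not data from the paper.
    preds = ["The Eiffel Tower!", "Rome"]
    golds = [["Eiffel Tower"], ["Milan", "Turin"]]
    print(corpus_exact_match(preds, golds))
```

Small differences in normalization (for example, whether articles or punctuation are stripped) can shift exact-match scores, so it is worth checking which variant a given evaluation script uses before comparing results.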

Spico197 commented 8 months ago

Thank you very much for the reply and the code update~