tloen / alpaca-lora

Instruct-tune LLaMA on consumer hardware
Apache License 2.0

Is there a way to evaluate performance between alpaca and alpaca-lora? #147

Open Jeffwan opened 1 year ago

Jeffwan commented 1 year ago

I'm just curious whether there's a scientific way to compare the performance of alpaca and alpaca-lora. Does the community have any evaluation scripts to run?

claysauruswrecks commented 1 year ago

@tloen - We can probably use this issue to track adding an eval harness to the pipeline: https://github.com/EleutherAI/lm-evaluation-harness
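As a starting point, here is a minimal sketch of how the LoRA checkpoint could be loaded so it can be handed to such a harness. The base model and adapter IDs (`decapoda-research/llama-7b-hf`, `tloen/alpaca-lora-7b`) follow this repo's generate.py and are assumptions that may need adjusting; the actual harness integration is left out.

```python
# Sketch: load base LLaMA plus the published LoRA adapter for evaluation.
# Model/adapter IDs mirror this repo's generate.py and may need adjusting.
import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

base_model_id = "decapoda-research/llama-7b-hf"  # assumed base checkpoint
lora_adapter_id = "tloen/alpaca-lora-7b"         # assumed published adapter

tokenizer = LlamaTokenizer.from_pretrained(base_model_id)
model = LlamaForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, lora_adapter_id, torch_dtype=torch.float16)
model.eval()

# `model` can now be wrapped by lm-evaluation-harness's HuggingFace model
# adapter (or queried directly) to score tasks such as PIQA or SQuAD.
```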

claysauruswrecks commented 1 year ago

https://github.com/openai/evals

claysauruswrecks commented 1 year ago

https://github.com/bigcode-project/bigcode-evaluation-harness

hujunchao commented 1 year ago

I am also interested in this issue.

JACKHAHA363 commented 1 year ago

Hi, is there an update?

gururise commented 1 year ago

> Hi, is there an update?

Added initial SQuAD benchmarks; working on PIQA.
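For anyone who wants to reproduce a PIQA-style number before an official script lands, below is a minimal sketch of the usual zero-shot multiple-choice recipe: score each candidate solution by the log-likelihood the model assigns to it and pick the higher one. The helpers `choice_logprob` and `piqa_accuracy` are hypothetical names, and this is not the exact benchmark script mentioned above; it assumes the HF `piqa` dataset fields (`goal`, `sol1`, `sol2`, `label`) and that the prompt tokenizes to the same prefix on its own as inside the full sequence.

```python
# Sketch: zero-shot PIQA-style accuracy via per-choice log-likelihoods.
# Run the same function on the full fine-tune and the LoRA model to compare.
import torch
from datasets import load_dataset


@torch.no_grad()
def choice_logprob(model, tokenizer, prompt, choice, device="cuda"):
    """Sum of log-probabilities of the `choice` tokens given `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    full_ids = tokenizer(prompt + " " + choice, return_tensors="pt").input_ids.to(device)
    logits = model(full_ids).logits
    # Position i of the shifted log-probs predicts token i+1 of the input.
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = full_ids[:, 1:]
    token_logprobs = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Count only the tokens that belong to the choice (approximate split at
    # the prompt length; assumes the prompt tokenizes to the same prefix).
    n_prompt = prompt_ids.shape[1]
    return token_logprobs[0, n_prompt - 1:].sum().item()


def piqa_accuracy(model, tokenizer, split="validation", limit=200):
    """Fraction of PIQA examples where the higher-likelihood choice is correct."""
    data = load_dataset("piqa", split=split).select(range(limit))
    correct = 0
    for ex in data:
        scores = [
            choice_logprob(model, tokenizer, ex["goal"], ex["sol1"]),
            choice_logprob(model, tokenizer, ex["goal"], ex["sol2"]),
        ]
        correct += int(scores.index(max(scores)) == ex["label"])
    return correct / len(data)
```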

ndvbd commented 1 year ago

Also interested