clembench / clembench-runs

All outputs generated by running the benchmark on different versions
MIT License

Adding results of recent 1.5 runs #7

Closed: Gnurro closed this 3 months ago

Gnurro commented 3 months ago

Models: WizardLM-70b-v1.0, tulu2-dpo-70b, vicuna-33b-v1.3, and sheep-duck-llama-2-70b-v1.1.

NOTE: referencegame may need to be re-run with updated game code!