Lightning-AI / litgpt

Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.

https://lightning.ai

Apache License 2.0

Update table with new benchmark results #1361

Closed. awaelchli closed this 3 weeks ago

awaelchli commented 3 weeks ago

Adds a new column with automated benchmark results to the table in config_hub/finetune/README.md, as a follow-up to #1337. The new column, "Multitask score", currently covers only MMLU. More categories will be added in the future.

The following settings were used to run MMLU:

    litgpt evaluate \
      --checkpoint_dir ... \
      --batch_size 4 \
      --device cuda \
      --dtype bfloat16 \
      --tasks mmlu \
      ...
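
To apply the same settings across several checkpoints, the invocation could be assembled programmatically. The following is a minimal sketch, not part of this PR: the helper name `build_mmlu_eval_command` and the path `path/to/checkpoint` are hypothetical placeholders, and the arguments elided with `...` above are deliberately left out rather than guessed.

```python
import shlex


def build_mmlu_eval_command(checkpoint_dir: str) -> list[str]:
    """Build the argument list for the MMLU run listed above.

    Only the flags shown in the PR description are included; any
    arguments the description elides with "..." are not reproduced.
    """
    return [
        "litgpt", "evaluate",
        "--checkpoint_dir", checkpoint_dir,
        "--batch_size", "4",
        "--device", "cuda",
        "--dtype", "bfloat16",
        "--tasks", "mmlu",
    ]


# "path/to/checkpoint" is a placeholder, not a path from the PR.
cmd = build_mmlu_eval_command("path/to/checkpoint")
command_line = shlex.join(cmd)
print(command_line)
```

The list form can be passed to `subprocess.run(cmd, check=True)` on a machine where litgpt is installed; keeping the arguments as a list avoids shell quoting problems when a checkpoint path contains spaces.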

Also removes the "Dataset" and "Precision" columns to make space, since their values are the same in every row.