huggingface / lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
MIT License

Is there any available example based on lighteval/run_evals_nanotron.py? #205

Closed JefferyChen453 closed 3 months ago

JefferyChen453 commented 3 months ago

First, thanks for providing this framework. I am following the FineWeb pipeline, using Nanotron for training and Lighteval for evaluation, but I ran into some problems.

1. Following the tiny_llama demo from nanotron, I got checkpoints like this one:

[image: screenshot of the checkpoint]

2. Following run_evals_nanotron.py, I launched the task with:

    torchrun --nproc-per-node 1 lighteval/run_evals_nanotron.py --checkpoint-config-path nanotron/examples/config_tiny_llama.yaml --lighteval-override lighteval/examples/nanotron/lighteval_config_override_template.yaml

Then I got this error:

[image: screenshot of the error]

The error seems to be related to the format of the .yaml files for both lighteval and the nanotron model. Could you provide an example, as you do for accelerate? Thank you.

JefferyChen453 commented 3 months ago

I solved this error by modifying lighteval_config_override_template.yaml: I deleted the first line and commented out the line recompute_granularity: null. But I wonder why this works. Is this a bug?
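
For anyone hitting the same error, the workaround above can be sketched as a small script. This is only an illustration under stated assumptions: the function name patch_override_template is made up, and the sample text is a hypothetical stand-in, not the real contents of lighteval_config_override_template.yaml. It drops the first line and comments out any recompute_granularity line, which is all the workaround does.

```python
def patch_override_template(text: str) -> str:
    """Drop the first line and comment out any `recompute_granularity:` line.

    Hypothetical helper that mirrors the manual edit described above.
    """
    lines = text.splitlines()[1:]  # delete the first line of the template
    patched = []
    for line in lines:
        stripped = line.lstrip()
        if stripped.startswith("recompute_granularity:"):
            # Keep the original indentation and comment the entry out.
            indent = line[: len(line) - len(stripped)]
            patched.append(f"{indent}# {stripped}")
        else:
            patched.append(line)
    return "\n".join(patched) + "\n"


# Hypothetical stand-in for the template contents (NOT the real file).
sample = (
    "first_line_placeholder\n"
    "parallelism:\n"
    "  dp: 1\n"
    "  recompute_granularity: null\n"
)

print(patch_override_template(sample))
```

To apply it to the real file, read the file's text, pass it through the function, and write the result to a copy, so the original template stays untouched for comparison.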