huggingface / lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
MIT License

Shape mismatches in MMLU #204

Closed marcobellagente93 closed 1 month ago

marcobellagente93 commented 3 months ago

I've set up accelerate and am running multi-GPU with

accelerate launch run_evals_accelerate.py --tasks="leaderboard|mmlu:abstract_algebra|0|0" --output_dir "/weka/home-marcob/lighteval/scores" --model_args "pretrained=gpt2"

I get:

Cannot apply desired operation due to shape mismatches. All shapes across devices must be valid.
[rank3]: Operation: `accelerate.utils.operations.gather`
[rank3]: Input shapes:
[rank3]:   - Process 0: [1024]
[rank3]:   - Process 1: [2048]
[rank3]:   - Process 2: [2048]
[rank3]:   - Process 3: [2048]
[rank3]:   - Process 4: [2048]
[rank3]:   - Process 5: [2048]
[rank3]:   - Process 6: [2048]
[rank3]:   - Process 7: [2048]

Note that with the exact same setup I can run 5-shot, i.e. the following doesn't throw an error

accelerate launch run_evals_accelerate.py --tasks="leaderboard|mmlu:abstract_algebra|5|0" --output_dir "/weka/home-marcob/lighteval/scores" --model_args "pretrained=gpt2"

The same error happens systematically when running the custom MMLU eval from finewebedu (https://huggingface.co/datasets/HuggingFaceFW/fineweb/blob/main/lighteval_tasks.py#L12).

Also note that without configuring accelerate to explicitly verify tensor shapes, no error is thrown and the process just hangs indefinitely.
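
For illustration, here is a minimal repro sketch of the failure mode being described. This is not lighteval code; it only mimics the shapes from the error above (1024 on one rank, 2048 on the others) and shows that a bare gather on mismatched tensors either trips accelerate's shape check or blocks.

# Minimal sketch, not lighteval's actual code; run with `accelerate launch repro.py`.
# The 1024/2048 lengths mirror the shapes reported in the error above.
import torch
from accelerate import Accelerator

accelerator = Accelerator()

# Pretend rank 0 ended up with a shorter tensor than the other ranks.
length = 1024 if accelerator.process_index == 0 else 2048
local = torch.zeros(length, device=accelerator.device)

# With mismatched shapes this gather either fails accelerate's shape
# verification (when it is enabled) or, as noted above, hangs indefinitely.
gathered = accelerator.gather(local)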

clefourrier commented 3 months ago

Thanks for reporting! We normally have a pad-and-gather step to prevent this (it pads splits that are too short before gathering, if needed). We'll investigate what happens here, as it should apply.
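
A rough sketch of that pad-and-gather pattern, assuming accelerate's public pad_across_processes utility rather than lighteval's internal helper:

# Rough sketch of pad-and-gather, assuming accelerate's pad_across_processes
# utility; lighteval's internal helper may differ.
import torch
from accelerate import Accelerator

accelerator = Accelerator()

# Ranks may hold different numbers of elements after a task is split unevenly.
local = torch.zeros(1024 if accelerator.process_index == 0 else 2048,
                    device=accelerator.device)

# Pad every rank's tensor to the largest length found on any rank, then gather:
# all ranks now contribute tensors of identical shape.
padded = accelerator.pad_across_processes(local, dim=0, pad_index=0)
gathered = accelerator.gather(padded)

After gathering, the padding can be sliced off again using the original per-rank lengths.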

clefourrier commented 2 months ago

Hi! I can't reproduce this. Can you try reinstalling from main and tell me if you're still getting the error? It would also help if you provided your detailed accelerate config.

clefourrier commented 1 month ago

Closing due to inactivity.