DrewGalbraith closed this issue 7 months ago.
For reproducibility, please also tell us which versions of lm_eval and transformers you are using.
Also, please check your pipeline against the latest clone of this repo. In the latest repo, this log_softmax call sits in a different place (https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/models/huggingface.py#L1045) and the logits attribute should always be used: https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/models/huggingface.py#L760
In the latest release, v0.4.2, logits are always used as well: https://github.com/EleutherAI/lm-evaluation-harness/blob/v0.4.2/lm_eval/models/huggingface.py#L748
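To make the distinction concrete, here is a minimal sketch (gpt2 is just an arbitrary stand-in model, not one from this issue) of why log_softmax has to be applied to the logits tensor rather than to the model-output wrapper:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("Hello world", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)  # a CausalLMOutputWithPast wrapper, not a tensor

# F.log_softmax(out, dim=-1)  # would raise: 'CausalLMOutputWithPast' object has no attribute 'log_softmax'
log_probs = F.log_softmax(out.logits, dim=-1)  # correct: operate on the logits tensor
```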
Hi! In addition to what @LSinev said, sharing the exact HuggingFace model this does or does not error on would also be helpful.
To my knowledge, since at least v0.4.0, `.logits` should always be used for models.
Thanks for getting back to me so fast!
@LSinev `mamba list` shows lm_eval==0.4.0, torch==2.1.2, and transformers==4.36.2.
@haileyschoelkopf, the model is Llama-2-7b-chat-hf.
@LSinev Looking at the lm_eval v0.4.2 lines you referenced, I'm thinking the newer version will probably work. I was able to find those same lines in v0.4.0 in the online repo, but in our downloaded copy of 0.4.0, line 507 doesn't return .logits, just the model output. That's likely where the error is. Either our copy of the repo predates the logits commit or someone on our team deleted the attribute for some reason. 🫣
After adding the attribute to that line, I have confirmed that the benchmark now runs with this model, both for mmlu and, for good measure, hellaswag.
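For reference, the one-line change described above amounts to something like this (a paraphrased sketch of the model-call helper in huggingface.py, not a verbatim diff):

```python
def _model_call(self, inps):
    # before: returned the full CausalLMOutputWithPast, which later breaks F.log_softmax
    # return self.model(inps)
    # after: return the logits tensor that the loglikelihood code expects
    return self.model(inps).logits
```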
Thanks for the pointer!
Problem Description
While running `lm_eval.simple_eval(...)`, I'm getting the following error:

`AttributeError: 'CausalLMOutputWithPast' object has no attribute 'log_softmax'`

(Full traceback below.) It gets as far as

`<TIME_STAMP> INFO [evaluator.py:314] Running loglikelihood requests`

and then breaks. The problem is that some models I run break when `F.log_softmax(...)` is called, because it is handed the LM output object (i.e., a CausalLMOutputWithPast) rather than its logits attribute. This doesn't seem to happen with all models, and I am having trouble figuring out which kinds of models don't trigger it! I have fixed this twice in the last few weeks by adding a try-except statement, once to LM-Eval's huggingface.py and once to PyTorch's torch/nn/functional.py. The try-except has the basic form sketched below. This seems like a shoddy solution, so I'm looking for something more permanent. The try-except, by the way, does keep things compatible with the models that don't err out with this.
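The workaround in question looks roughly like this (a sketch only; `output` stands in for whatever object the caller passes, and the exact patch may differ):

```python
import torch.nn.functional as F

try:
    log_probs = F.log_softmax(output, dim=-1)
except AttributeError:
    # `output` is a CausalLMOutputWithPast-style wrapper, so fall back to its logits tensor
    log_probs = F.log_softmax(output.logits, dim=-1)
```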
Reproducible example:
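A minimal sketch of the kind of call that triggers this, assuming the v0.4.x `lm_eval.simple_evaluate` entry point and the Llama-2-7b-chat-hf model mentioned above (the exact arguments of the original run may differ):

```python
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Llama-2-7b-chat-hf",
    tasks=["mmlu"],
    batch_size=1,
)
print(results["results"])
```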
Some info:
Full traceback: