DrewGalbraith opened this issue 2 months ago
An aside having to do with `AutoModel` vs `AutoModelForCausalLM` (see lines 19-27): the `AutoModel` registration seems unnecessary. Do we know why we included it at all? The code is confirmed to work without it.
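To illustrate why the extra registration may be dead code, here is a toy sketch of the auto-class registration pattern (the class and variable names below are illustrative stand-ins, not from the repo; in `transformers` the real calls are `AutoConfig.register` and `AutoModelForCausalLM.register`). If the eval path only resolves models through the causal-LM registry, the plain `AutoModel` mapping is never consulted:

```python
# Toy stand-in for transformers' auto-class registries (hypothetical names).
class AutoRegistry:
    def __init__(self):
        self._mapping = {}

    def register(self, model_type, cls):
        # Maps a config "model_type" string to a model class.
        self._mapping[model_type] = cls

    def from_config(self, model_type):
        # Instantiates the registered class for this model type.
        return self._mapping[model_type]()

class MyCausalLM:
    pass

auto_model = AutoRegistry()                 # stand-in for AutoModel
auto_model_for_causal_lm = AutoRegistry()   # stand-in for AutoModelForCausalLM

# Register only with the task-specific registry.
auto_model_for_causal_lm.register("my_model", MyCausalLM)

# Generation/eval code resolves through the causal-LM registry only,
# so the plain AutoModel registry can stay empty without breaking anything.
model = auto_model_for_causal_lm.from_config("my_model")
print(type(model).__name__)  # MyCausalLM
```

This matches the observed behavior: dropping the `AutoModel` registration changes nothing as long as every consumer goes through `AutoModelForCausalLM`.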
Somewhat related: we can't run some of the evaluation suite tasks because we can't decode, since we load a SentencePiece tokenizer rather than a HF one. Look at this line in lm_eval for how a model and its tokenizer are instantiated for benchmarking.
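One way to bridge the gap would be a thin adapter that exposes the HF-style `decode()` interface on top of a SentencePiece-style processor. A minimal sketch, with a toy class standing in for `sentencepiece.SentencePieceProcessor` (all names here are hypothetical, chosen for illustration):

```python
# Toy stand-in for sentencepiece.SentencePieceProcessor (hypothetical).
class SpmLike:
    def __init__(self, vocab):
        self.vocab = vocab

    def decode_ids(self, ids):
        # SentencePiece marks word boundaries with the "▁" metasymbol.
        return "".join(self.vocab[i] for i in ids).replace("\u2581", " ").strip()

class HFStyleTokenizer:
    """Wraps an spm-style processor behind the HF-style decode() interface
    that lm_eval's decoding path expects."""

    def __init__(self, sp):
        self.sp = sp

    def decode(self, token_ids, skip_special_tokens=True):
        return self.sp.decode_ids(list(token_ids))

sp = SpmLike(["\u2581hello", "\u2581world"])
tok = HFStyleTokenizer(sp)
print(tok.decode([0, 1]))  # hello world
```

A real adapter would also need `encode`, special-token handling, and batch decoding, but the same wrapping approach applies.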
When running `run_eval.sh` --> `eval_main.py` --> `eval_suite.py`, I get the following error: `AttributeError: 'Tensor' object has no attribute 'logits'. Did you mean: 'logit'?`
The `_model_call()` function in `lm_eval/models/huggingface.py` expects a HuggingFace `ModelOutput` object, which has a `logits` attribute, but we're getting back a `torch.Tensor` object, which uses a `logit()` method to get the same info. Switching `self.model(inps).logits` to `self.model(inps).logit()` works for the models we create, but gives the following error for HF models like Llama-2-7B-Chat:

In a pinch, this could be fixed with a try-except statement, but the better way is to figure out why we aren't correctly instantiating a `ModelOutput` object in the first place.