EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License

Issue with openai completions API - related to logprobs #2287

Open dmakhervaks opened 1 week ago

dmakhervaks commented 1 week ago

Hello, I believe there is a bug in your code on this line:

https://github.com/EleutherAI/lm-evaluation-harness/blob/543617fef9ba885e87f8db8930fbbff1d4e2ca49/lm_eval/models/openai_completions.py#L72

If I am understanding correctly, this should actually return the tokens, which would instead be accessed by:

tokens = choice["logprobs"]["tokens"][ctxlen:-1]

Here is an example of a return from an openai request:

{
  "id": "***",
  "object": "text_completion",
  "created": 1725923867,
  "model": "gpt-3.5-turbo-instruct",
  "choices": [
    {
      "text": "\n\n",
      "index": 0,
      "logprobs": {
        "tokens": ["\n\n"],
        "token_logprobs": [-0.8673025],
        "top_logprobs": [{"\n\n": -0.8673025}],
        "text_offset": [18]
      },
      "finish_reason": "length"
    }
  ],
  "usage": {"prompt_tokens": 5, "completion_tokens": 1, "total_tokens": 6}
}
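For illustration, here is a minimal sketch of the slicing the suggested fix implies. `parse_choice` is a hypothetical helper (not a function in the harness), and the `choice` dict below is a made-up example mimicking the response shape above; `ctxlen` is assumed to be the number of prompt (context) tokens:

```python
def parse_choice(choice, ctxlen):
    """Extract continuation tokens and their logprobs from one API choice.

    Hypothetical helper: applies the [ctxlen:-1] slice from the suggested
    fix to the "tokens" list (rather than "top_logprobs"), mirroring the
    slice used for "token_logprobs".
    """
    tokens = choice["logprobs"]["tokens"][ctxlen:-1]
    logprobs = choice["logprobs"]["token_logprobs"][ctxlen:-1]
    return tokens, logprobs

# Made-up choice dict with 1 prompt token followed by 2 completion tokens
choice = {
    "logprobs": {
        "tokens": ["Hello", " world", "\n\n"],
        "token_logprobs": [-1.2, -0.3, -0.8673025],
        "top_logprobs": [{}, {}, {"\n\n": -0.8673025}],
    }
}
tokens, logprobs = parse_choice(choice, ctxlen=1)
```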

baberabb commented 1 week ago

Hi! Thanks for catching that. Would appreciate a PR if you have the bandwidth!

dmakhervaks commented 1 week ago

@baberabb Sure, will send a PR by tomorrow.

ahmeda14960 commented 1 week ago

thanks for calling this out!