Closed by shan18 2 months ago
See https://github.com/jzhang38/EasyContext/issues/8
In short, an answer is only counted as correct when the argmax of the output logits at every token position in the answer span matches the corresponding answer token.
(Imagine the model predicts the first answer token wrong but the remaining tokens come out correct because of teacher forcing: the answer is still counted as incorrect.)
This is called PPL-based eval and is used to save memory and latency, since only one forward pass is needed.
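For concreteness, here is a minimal sketch of that scoring scheme (not the repo's actual code; it assumes a Hugging Face-style causal LM whose output exposes .logits, and 1-D token tensors prompt_ids and answer_ids):

```python
import torch

@torch.no_grad()
def ppl_style_eval(model, prompt_ids, answer_ids):
    # Append the gold answer tokens to the prompt (teacher forcing),
    # so a single forward pass scores every answer position at once.
    input_ids = torch.cat([prompt_ids, answer_ids], dim=-1).unsqueeze(0)
    logits = model(input_ids).logits.squeeze(0)

    # Logits at position i predict the token at position i + 1, so the
    # answer span is predicted by the positions just before it.
    start = prompt_ids.shape[-1] - 1
    preds = logits[start : start + answer_ids.shape[-1]].argmax(dim=-1)

    # Counted as correct only if *every* predicted token matches.
    return bool((preds == answer_ids).all())
```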
I see. Thanks a lot for the explanation.
Hi,
In eval_needle.py, I see that the answer_ids are being appended to the input prompt: https://github.com/jzhang38/EasyContext/blob/d6a7f2d74b08fc8049ec4a8146ef245051a669e3/eval_needle.py#L40
Could you please help me understand why this was implemented this way?
Wouldn't that make the model generate output in teacher-forcing mode instead of doing autoregressive decoding?
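(For contrast, the autoregressive alternative hinted at here would look roughly like the sketch below; this is an illustration, not EasyContext code, and assumes the same Hugging Face-style model and token tensors as in the earlier sketch. It needs one forward pass per generated token, which is exactly the cost the PPL-based eval avoids.)

```python
# Autoregressive alternative (sketch): greedily decode token by token
# and compare the generated span against the gold answer. Each new
# token costs an extra forward pass over the long context.
output_ids = model.generate(
    prompt_ids.unsqueeze(0),
    max_new_tokens=answer_ids.shape[-1],
    do_sample=False,  # greedy decoding
)
generated = output_ids[0, prompt_ids.shape[-1]:]
is_correct = bool((generated == answer_ids).all())
```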