Fix off-by-one error in eval

This fixes an off-by-one error in eval. The bug is subtle, but it significantly affects the results for tasks with short continuations (and makes all results slightly incorrect).

For example, take boolq. Before this fix, if I use the tokenizer to decode the query, I get:

Phantom pain sensations are described as perceptions that an individual experiences relating to a limb or an organ that is not physically part of the body. Limb loss is a result of either removal by amputation or congenital limb deficiency. However, phantom limb sensations can also occur following nerve avulsion or spinal cord injury.\nQuestion: is pain experienced in a missing body part or paralyzed area?\nAnswer:

With the fixed code, decoding the query properly gives:

This is not just an issue with boolq; it affects all tasks. Dropping the last token is clearly wrong.

I also fixed the indexing in a corresponding way within the ICLMetric.
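
To illustrate why this kind of indexing matters most for short continuations, here is a minimal, generic sketch. It is not the actual OLMo eval or ICLMetric code. It assumes a causal LM that is fed the full sequence ctx + continuation, where the output at position i predicts token i + 1. The function name `continuation_score_positions`, the whitespace "tokens", and the toy oracle that always predicts the true next token are all hypothetical and exist only for illustration.

```python
def continuation_score_positions(ctx_len: int, cont_len: int) -> range:
    """Output positions whose predictions score the continuation tokens.

    Assumption: the output at position i predicts token i + 1, so the first
    continuation token (index ctx_len) is scored at position ctx_len - 1.
    """
    start = ctx_len - 1
    return range(start, start + cont_len)


# Toy data: whitespace "tokens" stand in for real tokenizer ids.
ctx = ["Question:", "is", "the", "sky", "blue?", "Answer:"]
cont = ["yes"]
tokens = ctx + cont

# Toy oracle standing in for a model: predicted_next[i] == tokens[i + 1].
predicted_next = tokens[1:]

positions = continuation_score_positions(len(ctx), len(cont))

# Correct alignment: each scored position lines up with a continuation token.
aligned = [predicted_next[i] for i in positions]

# Shifting the window by one scores the wrong tokens. With a one-token
# continuation, none of the scored positions belongs to the continuation.
off_by_one = [predicted_next[i - 1] for i in positions]

print(aligned)
print(off_by_one)
```

With a one-token continuation such as a yes/no answer, a shift of one position means the score comes entirely from the wrong token, whereas for a long continuation most positions would still overlap. That is consistent with the observation above that short continuations are hit hardest.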

allenai / OLMo

Fix off-by-one error in eval #643