Left truncation was a terrible idea; I'm not sure why I ever thought it made sense.
Some models, in particular `unifiedqa-t5-11b`, have unusually short context lengths, so a significant fraction of prompts (e.g. 20%) gets truncated from the left, potentially removing important information about the task. This seems to lead to degraded performance.

This PR fixes the problem by simply skipping examples in `extract_hiddens` whose tokenized prompts exceed the max length indicated by the tokenizer.
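A minimal sketch of the skip logic, assuming a Hugging Face tokenizer; the helper name `iter_valid_examples` and its standalone structure are hypothetical, since the real filter lives inside `extract_hiddens`:

```python
from transformers import AutoTokenizer

def iter_valid_examples(prompts, tokenizer):
    """Yield only prompts that fit within the model's context window."""
    max_len = tokenizer.model_max_length
    for prompt in prompts:
        # Tokenize without truncation so we can measure the true length.
        input_ids = tokenizer(prompt, truncation=False)["input_ids"]
        if len(input_ids) > max_len:
            continue  # skip the example instead of left-truncating it
        yield prompt

# Example with the model named in this PR:
tokenizer = AutoTokenizer.from_pretrained("allenai/unifiedqa-t5-11b")
kept = list(iter_valid_examples(["Which is larger, 3 or 4? (a) 3 (b) 4"], tokenizer))
```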