Left truncation was a terrible idea; I'm not sure why I ever thought it made sense.
Some models, in particular `unifiedqa-t5-11b`, have unusually short context lengths, so a significant fraction of prompts (e.g. 20%) gets truncated from the left, potentially removing important information about the task. This seems to lead to degraded performance.

This PR fixes the problem by simply skipping examples in `extract_hiddens` whose tokenized prompts exceed the max length indicated by the tokenizer.
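A minimal sketch of the skip logic, assuming a Hugging Face tokenizer; the helper name `iter_valid_examples` and its standalone structure are hypothetical, since the real filter lives inside `extract_hiddens`:

```python
from transformers import AutoTokenizer

def iter_valid_examples(prompts, tokenizer):
    """Yield only prompts that fit within the model's context window."""
    max_len = tokenizer.model_max_length
    for prompt in prompts:
        # Tokenize without truncation so we can measure the true length.
        input_ids = tokenizer(prompt, truncation=False)["input_ids"]
        if len(input_ids) > max_len:
            continue  # skip the example instead of left-truncating it
        yield prompt

# Example with the model named in this PR:
tokenizer = AutoTokenizer.from_pretrained("allenai/unifiedqa-t5-11b")
kept = list(iter_valid_examples(["Which is larger, 3 or 4? (a) 3 (b) 4"], tokenizer))
```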