michaelfeil closed this 4 months ago
Hi, thanks for the info. If I understand this correctly, it only affects the generate() function? We did not use the generate function in our eval. See https://github.com/jzhang38/EasyContext/issues/19
Ah, gotcha. This should be irrelevant then, you are correct! (We only do 1x prefill)
Checked out your evals!
I think this should affect the generation quality. https://github.com/huggingface/transformers/pull/30380