Opened by faaany, 4 months ago (status: Open)
I made a possible fix suggestion in this PR draft: https://github.com/huggingface/transformers/pull/32039, but I am not sure whether it is correct, so I also filed this issue.
cc @gante too
The incompatibility also affects Gemma2 with flash-attn, as Gemma2 does not support a dynamic cache.
System Info
transformers version: 4.43.0.dev0

Who can help?
@ArthurZucker
Information

Tasks
An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)

Reproduction
The left padding test case fails with:

And the right padding test case also fails:
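For context, here is a minimal sketch of the kind of setup these padding tests exercise: Gemma2 loaded with flash attention while the static cache implementation is forced. The checkpoint name, prompts, and generation arguments below are my assumptions, not the actual test code.

```python
# Minimal sketch of the incompatible combination (assumed checkpoint,
# prompts, and generation arguments; not the actual test code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b"  # assumption: any Gemma2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left")
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires flash-attn and a supported GPU
    device_map="auto",
)

prompts = ["Hello", "What is your favorite condiment?"]
inputs = tokenizer(prompts, padding=True, return_tensors="pt").to(model.device)

# Forcing the static cache together with flash attention is what the tests trip over.
out = model.generate(**inputs, max_new_tokens=20, cache_implementation="static")
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```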
Expected behavior
Either we don't test `flash_attention` in this case, or we should add an if-check that skips setting `cache_implementation` to `static`.
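A minimal sketch of what such an if-check could look like (the helper name and generation kwargs are placeholders I made up, not the actual test code):

```python
def generation_kwargs_for(attn_implementation: str) -> dict:
    """Build generate() kwargs, only requesting the static cache when the backend supports it."""
    kwargs = {"max_new_tokens": 20, "do_sample": False}
    if attn_implementation != "flash_attention_2":
        # Per this issue, flash attention is not compatible with the static cache,
        # so skip cache_implementation="static" for it.
        kwargs["cache_implementation"] = "static"
    return kwargs

# e.g. generation_kwargs_for("flash_attention_2") contains no "cache_implementation" key,
# while generation_kwargs_for("sdpa") still forces the static cache.
```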