Open Zhangyanbo opened 1 day ago
Hmm, actually we implemented Mllama much like Idefics, so the cache is not initialized by default when use_cache=True. And yes, I think it makes sense to init an empty cache unless the model is a special case, like Gemma, with its own special cache
Until the fix is there, you can get the past key values by passing model(**inputs, past_key_values=DynamicCache(), use_cache=True)
But I see that the model weights will not be loaded properly for the CausalModel. In fact, the ConditionalModel can handle text-only input, so for proper logits computation I'd recommend using the ConditionalModel
:)
With use_cache, we should probably just init a default cache for the user, or else opt for forcing users to pass a cache object
Yes, exactly. I can make a PR for that
System Info
transformers version: 4.45.2
Who can help?
@amyeroberts @ArthurZucker
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Expected behavior
I expect to see a past_key_values in the output. However, I got None.