Open · myungjin opened 1 month ago
Hey! Thanks for reporting! cc @michaelbenayoun I can reproduce, but no idea how to avoid this. It's also not new, so I think the past key value path was never tested!
@ArthurZucker I’ve experienced the same issue. It seems that although the fix from https://github.com/huggingface/transformers/issues/29923 resolves the cache-related problem, in https://github.com/huggingface/transformers/blob/70b07d97cf2c5f61fff55700b65528a1b6845cd2/src/transformers/utils/fx.py#L1052-L1068 we’re still creating dummy inputs for past_key_values using the old tuple format. I replaced this part with:
```python
elif "past_key_values" in input_name:
    if model.config.model_type not in _FX_SUPPORTED_MODELS_WITH_KV_CACHE:
        raise NotImplementedError(
            f"Symbolic tracing with past_key_values input is not supported yet for the model {model.config.model_type}. Please open an issue or a PR in the Transformers repository if you'd like to see this support added."
        )
    inputs_dict[input_name] = DynamicCache()
```
It seems we can now trace Llama. However, I’m not sure if this is the correct way to fix the issue.
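As a sanity check on the control flow of that replacement, here is a minimal, standalone sketch. Everything in it is a stand-in I made up for illustration: `DummyCache` mimics transformers' `DynamicCache`, `make_dummy_input` mimics the dummy-input dispatch, and the model-type set is an arbitrary subset, not the real `_FX_SUPPORTED_MODELS_WITH_KV_CACHE`.

```python
# Illustrative subset only; not the real _FX_SUPPORTED_MODELS_WITH_KV_CACHE.
_FX_SUPPORTED_MODELS_WITH_KV_CACHE = {"llama", "gemma"}


class DummyCache:
    """Placeholder standing in for transformers' DynamicCache."""

    def __init__(self):
        self.key_cache, self.value_cache = [], []


def make_dummy_input(input_name, model_type):
    """Mirror the proposed guard: cache object for supported models, error otherwise."""
    if "past_key_values" in input_name:
        if model_type not in _FX_SUPPORTED_MODELS_WITH_KV_CACHE:
            raise NotImplementedError(
                f"Symbolic tracing with past_key_values is not supported yet "
                f"for the model {model_type}."
            )
        return DummyCache()
    return None


cache = make_dummy_input("past_key_values", "llama")
print(type(cache).__name__)  # -> DummyCache
```

The point is just that a supported model type gets a cache object instead of a legacy tuple, while an unsupported one fails loudly before tracing starts.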
I think there are fixes for direct tracing in optimum, but happy to have something more general here!
@ArthurZucker Thanks for looking into this.
Can you provide a link or pointer to the fixes you refer to?
I went through the optimum documentation. The examples in it still rely on `symbolic_trace`, so I don't see how optimum can help address this issue.
The linked PR should help!
System Info

`transformers` version: 4.45.1

Who can help?

No response

Information

Tasks

- `examples` folder (such as GLUE/SQuAD, ...)

Reproduction

Expected behavior

It should run without error. `symbolic_trace()` with `["input_ids", "attention_mask"]` runs fine. However, when `["input_ids", "attention_mask", "past_key_values"]` is fed as `input_names`, the error occurs. If using `past_key_values` is not supported, a warning should be raised and tracing aborted before the model is traced. While a fix for a related error (https://github.com/huggingface/transformers/issues/29923) is included in the released version, it seems there is still some bug.
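The "warn and abort" behavior requested above could look roughly like the sketch below. It is only an illustration of the idea: `check_input_names` and `TRACEABLE_INPUT_NAMES` are hypothetical names I introduced, not transformers' actual API or its real supported-input list.

```python
import warnings

# Illustrative set of input names assumed to trace cleanly; not the real list.
TRACEABLE_INPUT_NAMES = {"input_ids", "attention_mask"}


def check_input_names(input_names):
    """Warn and signal an abort if any requested input is not traceable."""
    unsupported = sorted(set(input_names) - TRACEABLE_INPUT_NAMES)
    if unsupported:
        warnings.warn(
            f"Symbolic tracing does not support these inputs yet: {unsupported}. "
            "Aborting before tracing the model."
        )
        return False
    return True


print(check_input_names(["input_ids", "attention_mask"]))  # -> True
print(check_input_names(["input_ids", "attention_mask", "past_key_values"]))  # -> False
```

Running such a pre-check before building dummy inputs would surface the limitation up front instead of failing mid-trace.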