Extend FX-supported models with KV cache

Feature request

I noticed that only Llama and OPT models currently support FX tracing with KV cache. Is there a plan to extend this to more models? Thanks!

Motivation

I would like to run FX-traced GraphModules with generate(), which uses the KV cache. Right now this works for OPT and Llama, but I would like to try it on more models.

Your contribution

If someone could point me to the general design pattern for making a model FX-traceable with KV cache, or to the changes in modeling_opt.py or modeling_llama.py that made those models work, I would be happy to submit PRs to support more models.

huggingface / transformers