Open purejomo opened 4 months ago
You can refer to this: https://github.com/JAYANDJEAN/From_Transformer_to_GPTs/blob/main/04_llama2/llama.py I use a `use_cache` flag to control whether the cache is used, because the cache is not needed during training.
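To make the idea concrete, here is a minimal sketch of single-head attention with a `use_cache` flag, in the spirit of the linked file. This is not the code from that repo: the class name, weight shapes, and the exact caching logic are assumptions for illustration. With `use_cache=False` (training), attention runs over the full sequence with a causal mask and nothing is stored between calls; with `use_cache=True` (inference), each call appends the new keys/values to the cache and attends over everything seen so far.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class Attention:
    """Minimal single-head attention with an optional KV cache.

    Illustrative sketch only -- the names and shapes here are
    assumptions, not the implementation from the linked repo.
    """
    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.wq = rng.normal(size=(dim, dim))
        self.wk = rng.normal(size=(dim, dim))
        self.wv = rng.normal(size=(dim, dim))
        self.cache_k = None
        self.cache_v = None

    def forward(self, x, use_cache=False):
        # x: (seq_len, dim)
        q, k, v = x @ self.wq, x @ self.wk, x @ self.wv
        if use_cache:
            # Inference: append the new keys/values to the running cache
            # and attend over all tokens seen so far.
            if self.cache_k is None:
                self.cache_k, self.cache_v = k, v
            else:
                self.cache_k = np.concatenate([self.cache_k, k], axis=0)
                self.cache_v = np.concatenate([self.cache_v, v], axis=0)
            k, v = self.cache_k, self.cache_v
        scores = q @ k.T / np.sqrt(q.shape[-1])
        if not use_cache:
            # Training path: full sequence in one pass, causal mask,
            # no state kept between calls.
            n = scores.shape[0]
            scores = np.where(np.tril(np.ones((n, n), bool)), scores, -np.inf)
        return softmax(scores) @ v
```

As a sanity check, decoding one token at a time with `use_cache=True` should produce the same outputs as a single full-sequence pass with `use_cache=False`, since the incremental path is causal by construction.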
@JAYANDJEAN Thanks. Does that mean I can turn off caching by modifying the code?
Hello,
Could you please advise me on how to disable the KV cache? I would also appreciate any guidance on how to implement this change in code.
Thank you for your assistance.