tenstorrent / tt-forge-fe

The TT-Forge FE is a graph compiler designed to optimize and transform computational graphs for deep learning models, enhancing their performance and efficiency.
https://docs.tenstorrent.com/tt-forge-fe/
Apache License 2.0

[Llama 3B] Support for attention block (no KV cache) #123

Open nvukobratTT opened 1 month ago

nvukobratTT commented 1 week ago

Moving KV cache support to P1, to be picked up once we get a functional model working e2e: