YuchuanTian / DiJiang

[ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear attention mechanism.
https://arxiv.org/abs/2403.19928
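Since the repository's modeling code is not yet released (the subject of this issue), here is a rough illustrative sketch of the general idea named in the description: a kernelized linear attention whose feature map uses a DCT. The feature map below (`exp` of an orthonormal DCT) is a hypothetical stand-in, not the paper's actual kernel, and the function names are invented for this example.

```python
import numpy as np
from scipy.fft import dct

def dct_feature_map(x):
    # Hypothetical positive feature map: orthonormal DCT-II followed by exp.
    # This is only a stand-in for the paper's DCT-based kernel approximation.
    return np.exp(dct(x, type=2, norm="ortho", axis=-1))

def linear_attention(Q, K, V):
    # Kernelized (linear) attention: O(n * d^2) instead of O(n^2 * d).
    q, k = dct_feature_map(Q), dct_feature_map(K)   # (n, d) each
    kv = k.T @ V                                    # (d, d_v) summary of keys/values
    z = q @ k.sum(axis=0)                           # per-query normalizer, shape (n,)
    return (q @ kv) / z[:, None]                    # (n, d_v)
```

Because the feature map is strictly positive, each output row is a convex combination of the rows of `V`, mirroring the normalization property of softmax attention while avoiding the quadratic attention matrix.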

Llama 7B? #2

Open pharaouk opened 3 months ago

pharaouk commented 3 months ago

Hi, any chance you could upload the modeling file for a Llama-7B-equivalent DiJiang?

HantingChen commented 2 months ago

Hi, it's in the plans, but there is no set timeline yet. Thanks for your interest!

pharaouk commented 2 months ago

OK, thanks. Could you please share the modeling code in the meantime? It would be helpful for our research.