YuchuanTian / DiJiang

[ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear attention mechanism.
https://arxiv.org/abs/2403.19928

Wrong configuration settings in pythia-2.8B/1B #3

Closed. 4IK1d closed this issue 3 months ago

4IK1d commented 3 months ago

It looks like either the directory names or the settings in config.json are wrong.

config.json under modeling/pythia-2.8B

image

config.json under modeling/pythia-1B

image

Original config.json with pythia-2.8B

image
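
A mix-up like this can be confirmed by diffing the two config.json files key by key. The following is a minimal sketch, not part of the repository: `load_config`, `diff_configs`, and the paths in the usage comment are hypothetical names assumed for illustration.

```python
import json
from pathlib import Path

# Marker used when a key exists in only one of the two configs.
MISSING = "<missing>"


def load_config(path):
    """Read a config.json file and return its contents as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def diff_configs(config_a, config_b):
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    keys = sorted(set(config_a) | set(config_b))
    return {
        key: (config_a.get(key, MISSING), config_b.get(key, MISSING))
        for key in keys
        if config_a.get(key, MISSING) != config_b.get(key, MISSING)
    }


# Usage (paths are assumptions; point them at the files you want to compare):
#   repo_cfg = load_config("modeling/pythia-2.8B/config.json")
#   original_cfg = load_config("path/to/original/pythia-2.8B/config.json")
#   for key, (ours, theirs) in diff_configs(repo_cfg, original_cfg).items():
#       print(f"{key}: repo={ours!r} original={theirs!r}")
```

An empty result means the two files agree on every key. Any entry that shows up points at a field that was copied into the wrong directory.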

YuchuanTian commented 3 months ago

Thanks for your kind reminder. I have corrected this mistake.