ReaLLMASIC / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.
MIT License

Make n_kv_group 6 by default to enable flash attn #164

Closed by gkielian 3 months ago