pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

Naming: n_local_heads -> n_kv_heads #162

Open ad8e opened 2 months ago

ad8e commented 2 months ago

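The name `n_local_heads` suggests tensor-parallel (TP) sharding, i.e. the heads local to one rank, rather than grouped-query attention (GQA). If the field is meant to hold the number of key/value heads, `n_kv_heads` would be a clearer name.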
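To illustrate why the two concepts deserve separate names, here is a minimal sketch that keeps the GQA head count and the per-rank (TP) head count apart. `AttnConfig`, `heads_per_rank`, and `tp_world_size` are hypothetical names made up for this example, not gpt-fast's actual `ModelArgs` or TP code.

```python
from dataclasses import dataclass


@dataclass
class AttnConfig:
    """Hypothetical config for illustration; not gpt-fast's ModelArgs."""
    n_head: int      # number of query heads in the full model
    n_kv_heads: int  # number of key/value heads (GQA); equals n_head for plain MHA
    head_dim: int


def heads_per_rank(cfg: AttnConfig, tp_world_size: int) -> tuple[int, int]:
    """Return (query heads, kv heads) held by one tensor-parallel rank."""
    if cfg.n_head % tp_world_size or cfg.n_kv_heads % tp_world_size:
        raise ValueError("head counts must be divisible by the TP world size")
    # "Local" is a property of TP sharding: each rank holds a slice of the heads.
    return cfg.n_head // tp_world_size, cfg.n_kv_heads // tp_world_size


cfg = AttnConfig(n_head=32, n_kv_heads=8, head_dim=128)

# GQA: how many query heads share one key/value head (independent of TP).
group_size = cfg.n_head // cfg.n_kv_heads

# TP: how many heads of each kind live on one rank (independent of GQA naming).
q_local, kv_local = heads_per_rank(cfg, tp_world_size=4)
```

In this sketch `n_kv_heads` is a model-level property, while the "local" counts only exist once a TP world size is chosen, which is the distinction the rename is meant to make visible.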