Infer cache/RoPE weight dtype from output weights

pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

BSD 3-Clause "New" or "Revised" License

Status: Closed (malfet closed 3 months ago)

malfet commented 3 months ago

Add a dtype argument to precompute_freqs_cis.
Infer the dtype of the caches and RoPE weights from the output weight dtype in the Transformer constructor.
This way, precision can be changed in one place in generate.py, and the change propagates throughout the model.
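
A minimal sketch of the idea described above, not the actual diff from this PR. `TinyTransformer`, its sizes, and the `setup_caches` helper are hypothetical stand-ins, and the cache layout is an assumption. The sketch reads the dtype in a helper called after the model has been cast, which may differ from where the PR does it. The point it illustrates is that the KV cache and the RoPE table both take their dtype from `self.output.weight`, so casting the model once is enough.

```python
import torch
import torch.nn as nn


def precompute_freqs_cis(seq_len: int, n_elem: int, base: int = 10000,
                         dtype: torch.dtype = torch.bfloat16) -> torch.Tensor:
    # Compute the rotation angles in float32, then cast once at the end.
    freqs = 1.0 / (base ** (torch.arange(0, n_elem, 2)[: n_elem // 2].float() / n_elem))
    t = torch.arange(seq_len, dtype=torch.float32)
    angles = torch.outer(t, freqs)
    # Store (cos, sin) pairs in the last dimension: shape (seq_len, n_elem // 2, 2).
    cache = torch.stack([torch.cos(angles), torch.sin(angles)], dim=-1)
    return cache.to(dtype=dtype)


class TinyTransformer(nn.Module):
    """Hypothetical stand-in for the real model; only the dtype plumbing matters."""

    def __init__(self, dim: int = 64, n_head: int = 4, vocab_size: int = 128):
        super().__init__()
        self.dim = dim
        self.n_head = n_head
        self.output = nn.Linear(dim, vocab_size, bias=False)
        self.k_cache = None
        self.v_cache = None
        self.freqs_cis = None

    def setup_caches(self, max_batch_size: int, max_seq_length: int) -> None:
        # Single source of truth: whatever precision the output weights have.
        dtype = self.output.weight.dtype
        head_dim = self.dim // self.n_head
        shape = (max_batch_size, self.n_head, max_seq_length, head_dim)
        self.k_cache = torch.zeros(shape, dtype=dtype)
        self.v_cache = torch.zeros(shape, dtype=dtype)
        self.freqs_cis = precompute_freqs_cis(max_seq_length, head_dim, dtype=dtype)


# Changing precision in one place: cast the model, then build the caches.
model = TinyTransformer().to(dtype=torch.float16)
model.setup_caches(max_batch_size=1, max_seq_length=16)
print(model.k_cache.dtype, model.freqs_cis.dtype)
```

Because nothing in `setup_caches` hard-codes a precision, switching the cast to `torch.bfloat16` or `torch.float32` changes the caches and the RoPE table with no other edits.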