pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

Llama3 8b perf numbers on A100 #166

Closed by yanboliang 2 weeks ago

yanboliang commented 2 months ago

Perf numbers for the Llama3-8B implementation added in https://github.com/pytorch-labs/gpt-fast/pull/158