0hq / WebGPT

Run a GPT model in the browser with WebGPU. An implementation of GPT inference in under ~1500 lines of vanilla JavaScript.
https://kmeans.org

add sample script for int8-gemm #31

Closed · carsonpo closed this 1 year ago

carsonpo commented 1 year ago

I don't have time to wire this into your existing code, but it gives roughly 3.5x the FLOPs for a very skinny matmul (cached KV inference) and should cut the model checkpoint size by about 4x. It still needs a better absmax calculation (probably vectorwise instead of the clearly suboptimal global one), but the MAE is very reasonable for the setup shown.
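
For context, here is a minimal sketch of absmax int8 quantization contrasting a single global scale with per-row ("vectorwise") scales. This is not the PR's actual script; the function names and the tiny example matrix are illustrative only.

```javascript
// Quantize with one scale for the whole matrix: q = round(w * 127 / absmax).
function quantizeGlobal(weights) {
  let absmax = 0;
  for (const w of weights) absmax = Math.max(absmax, Math.abs(w));
  const scale = absmax / 127 || 1; // avoid divide-by-zero on an all-zero matrix
  const q = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) q[i] = Math.round(weights[i] / scale);
  return { q, scale };
}

// Quantize each row with its own absmax scale, which limits the damage
// a single outlier can do compared to one global scale.
function quantizeRowwise(weights, rows, cols) {
  const q = new Int8Array(weights.length);
  const scales = new Float32Array(rows);
  for (let r = 0; r < rows; r++) {
    let absmax = 0;
    for (let c = 0; c < cols; c++) {
      absmax = Math.max(absmax, Math.abs(weights[r * cols + c]));
    }
    const scale = absmax / 127 || 1;
    scales[r] = scale;
    for (let c = 0; c < cols; c++) {
      q[r * cols + c] = Math.round(weights[r * cols + c] / scale);
    }
  }
  return { q, scales };
}

// Mean absolute error after dequantizing with the global scale.
function maeGlobal(weights, q, scale) {
  let err = 0;
  for (let i = 0; i < weights.length; i++) err += Math.abs(weights[i] - q[i] * scale);
  return err / weights.length;
}

// Example: quantize a tiny 2x3 matrix and report the global-scale MAE.
const w = new Float32Array([0.3, -1.2, 0.05, 2.4, -0.7, 0.9]);
const { q, scale } = quantizeGlobal(w);
console.log('MAE (global absmax):', maeGlobal(w, q, scale));
```

Storing weights as int8 plus a scale is also what yields the ~4x checkpoint-size reduction relative to float32.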

vercel[bot] commented 1 year ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| web-gpt | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | May 1, 2023 5:25pm |
0hq commented 1 year ago

Sweet! What's with the change to params_gpt?