0hq / WebGPT

Run GPT model on the browser with WebGPU. An implementation of GPT inference in less than ~1500 lines of vanilla Javascript.
https://kmeans.org
Other
3.61k stars 206 forks source link

Failed experiment: optimizing selectTopK #15

Closed chenglou closed 1 year ago

chenglou commented 1 year ago

Profiler shows that the next bottleneck is in selectTopK, specifically the sorting, since GPT2 mode would make you sort 50k floats each time

The sorting function is O(nlogn). This converts it to O(n) using quickselect instead of the JS engine's sort (e.g. quicksort), since we don't need the entire array sorted (just the top k)

But really, since this only represents 10% of the bottleneck in the screenshot, and log_2(50000) is only 15, this isn't worth it. The code gets complex for no good reason.

Screenshot 2023-04-21 at 21 57 13

(Anonymous being the sort callback)

vercel[bot] commented 1 year ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
web-gpt ✅ Ready (Inspect) Visit Preview 💬 Add feedback Apr 22, 2023 5:03am
chenglou commented 1 year ago

@0hq imma leave this as reference but close it!