praeclarum / webgpu-torch

Tensor computation with WebGPU acceleration
MIT License

Decrease GPU buffer allocations using custom heap #6

Closed praeclarum closed 1 year ago

praeclarum commented 1 year ago

The heap does give a performance increase for small objects, but it has two problems:

  1. The garbage collector isn't aggressive enough about freeing buffers from the heap.
  2. Kernels end up sharing input and output buffers, which makes readonly_storage hard to enforce.

I'm keeping the code but disabling it by default.
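
To illustrate the second problem, here is a minimal sketch of a sub-allocating heap. It is not the code from this repository: `BufferHeap`, `HeapAllocation`, and `sharesBackingBuffer` are hypothetical names, and the heap only tracks byte offsets instead of owning a real GPU buffer. The 256-byte alignment is assumed from WebGPU's default `minStorageBufferOffsetAlignment` limit. As far as I understand the usage rules, a buffer is validated as a whole within a dispatch, so two slices of the same backing buffer cannot be bound as read-only storage and writable storage at the same time.

```typescript
// Hypothetical sketch; none of these names come from webgpu-torch.
interface HeapAllocation {
  heapId: number; // identifies the large backing buffer this slice lives in
  offset: number; // byte offset of the slice inside the backing buffer
  size: number;   // slice size in bytes, rounded up to the alignment
}

class BufferHeap {
  private nextOffset = 0;
  private freeList: HeapAllocation[] = [];

  constructor(
    readonly heapId: number,
    readonly capacity: number,
    readonly alignment: number = 256, // assumed storage offset alignment
  ) {}

  // Returns a slice of the backing buffer, or null when the heap is full.
  alloc(byteSize: number): HeapAllocation | null {
    const size =
      Math.ceil(Math.max(byteSize, 1) / this.alignment) * this.alignment;

    // First fit: reuse a previously freed slice that is large enough.
    const reused = this.freeList.find((slice) => slice.size >= size);
    if (reused) {
      this.freeList = this.freeList.filter((slice) => slice !== reused);
      return reused;
    }

    // Otherwise bump-allocate from the unused tail of the heap.
    if (this.nextOffset + size > this.capacity) {
      return null;
    }
    const allocation = { heapId: this.heapId, offset: this.nextOffset, size };
    this.nextOffset += size;
    return allocation;
  }

  // Slices only return to the heap when someone calls free explicitly.
  free(allocation: HeapAllocation): void {
    this.freeList.push(allocation);
  }
}

// True when any input and any output are slices of the same backing buffer,
// which is the case that makes read-only input bindings hard to enforce.
function sharesBackingBuffer(
  inputs: HeapAllocation[],
  outputs: HeapAllocation[],
): boolean {
  return inputs.some((input) =>
    outputs.some((output) => output.heapId === input.heapId),
  );
}
```

With one heap per large buffer, most small tensors land in the same heap, so a check like `sharesBackingBuffer` would fire for nearly every kernel. The sketch also shows why the first problem matters: a slice is only reusable after `free` is called, so if that call depends on garbage collection, the heap fills up until the collector gets around to it.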