mazed-dev / truthsayer

Second brain for knowledge workers that retains everything you read online and helps you to back up your communication with your source as easily as you might share an anecdote over a coffee.
MIT License

make tfjs WebGL backend cache textures less aggressively #833

Closed SergNikitin closed 1 year ago

SergNikitin commented 1 year ago

Currently we use the WebGL backend for tfjs. Under the hood, every tf.Tensor is represented by a WebGL texture, and textures are acquired and released by an internal TextureManager class. So TextureManager acts much like a std::allocator.

It turns out that TextureManager uses a common allocator optimisation: when application code calls Tensor.dispose(), the GPU memory is not deallocated. Instead, the texture is moved to a pool of unused ("free") textures, so the memory remains unavailable to other processes on the user's machine. It then appears to hold onto these textures indefinitely, until the whole backend is destroyed. When a new allocation is needed, it reuses one of the "free" textures, but only if the size of the requested texture matches the size of a pooled texture exactly.
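
The behaviour above can be sketched with a toy pool (this is illustrative code, not the real tfjs TextureManager; all names are made up). The key point is the exact-size reuse rule: a pooled texture is handed out again only for an allocation of precisely the same size.

```javascript
// Toy sketch of a std::allocator-style texture pool keyed by exact size,
// mimicking the reuse rule described above. Not the real tfjs code.
class FakeTextureManager {
  constructor() {
    this.freeBySize = new Map(); // size in bytes -> array of pooled "textures"
    this.numBytesAllocated = 0;
    this.numBytesFree = 0;
  }
  acquire(size) {
    const pool = this.freeBySize.get(size);
    if (pool && pool.length > 0) {
      this.numBytesFree -= size; // reuse happens only on an exact size match
      return pool.pop();
    }
    this.numBytesAllocated += size; // otherwise allocate fresh GPU memory
    return { size };
  }
  release(texture) {
    // dispose() does not return GPU memory to the OS; the texture is pooled
    const pool = this.freeBySize.get(texture.size) || [];
    pool.push(texture);
    this.freeBySize.set(texture.size, pool);
    this.numBytesFree += texture.size;
  }
}

const mgr = new FakeTextureManager();
const a = mgr.acquire(1024);
mgr.release(a);
const b = mgr.acquire(1024); // exact match: reused, no new allocation
mgr.acquire(1025);           // off by one byte: a brand-new allocation
console.log(mgr.numBytesAllocated); // 2049 (1024 + 1025)
console.log(a === b);               // true
```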

This is not a big issue for KnnClassifier because it always allocates tensors from a small, consistent set of sizes (I think) - for example, 2x48. But universal-sentence-encoder apparently sizes its tensors based on the length of the input phrase, so the size differs on most (or every) call. This allocates lots of textures of various sizes and never releases the memory back to the OS.
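
The difference between the two allocation patterns can be demonstrated with the same simplified size-keyed pool (again, a sketch with made-up names, not tfjs internals): fixed-size requests converge to one reused texture, while varied-size requests allocate fresh memory on every call.

```javascript
// Simulate a dispose-to-pool allocator with exact-size reuse and count
// how many bytes of fresh GPU memory each usage pattern would allocate.
function simulate(sizes) {
  const free = new Map(); // size -> count of pooled textures
  let allocated = 0;
  for (const size of sizes) {
    if ((free.get(size) || 0) > 0) {
      free.set(size, free.get(size) - 1); // reuse on exact size match
    } else {
      allocated += size;                  // no match: fresh allocation
    }
    // Tensor is used, then dispose()-ed: back to the pool, not to the OS.
    free.set(size, (free.get(size) || 0) + 1);
  }
  return allocated;
}

// KnnClassifier-like: same size every time -> one allocation, reused forever
console.log(simulate([96, 96, 96, 96, 96])); // 96

// Encoder-like: size tracks input length -> every call allocates anew
console.log(simulate([10, 25, 17, 42, 33])); // 127
```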

As a result, even for a small dev test account I get the following stats after brief usage:

_numBytesAllocated: 671494752 // ~670MB
_numBytesFree: 643054816 // ~640MB

So in my case only ~30MB of GPU memory is actually in use, but ~670MB was allocated. See the full snapshot of the TextureManager data here
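
Working the arithmetic on those two counters shows how little of the allocated memory is doing real work:

```javascript
// The two counters from the snapshot above.
const numBytesAllocated = 671494752; // ~670MB
const numBytesFree = 643054816;      // ~640MB

// Memory actually backing live tensors = allocated minus pooled-but-idle.
const inUse = numBytesAllocated - numBytesFree;
console.log(inUse); // 28439936 bytes, i.e. roughly 28MB in use

// Fraction of allocated GPU memory sitting idle in the texture pool.
const idlePct = (numBytesFree / numBytesAllocated) * 100;
console.log(idlePct.toFixed(0) + '%'); // "96%"
```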

SergNikitin commented 1 year ago

A similar issue has been discussed in https://github.com/tensorflow/tfjs/issues/1440, with some workarounds offered. On the surface it looks like a different usage pattern to me, but I haven't tried the workarounds yet
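
One configuration-level mitigation worth evaluating, assuming the installed tfjs version supports the WEBGL_DELETE_TEXTURE_THRESHOLD flag (not verified against our version), is to tell the WebGL backend to start deleting pooled textures once total allocation passes a byte threshold, instead of caching them indefinitely:

```javascript
// Sketch, not verified against this repo's tfjs version: if the flag is
// supported, textures released past the threshold are deleted rather
// than pooled, capping how much idle GPU memory TextureManager retains.
import * as tf from '@tensorflow/tfjs';

// Example threshold (64MB) chosen arbitrarily for illustration.
tf.env().set('WEBGL_DELETE_TEXTURE_THRESHOLD', 64 * 1024 * 1024);
```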

See also https://github.com/tensorflow/tfjs/issues/4166