Second brain for knowledge workers that retains everything you read online and helps you to back up your communication with your source as easily as you might share an anecdote over a coffee.
make tfjs WebGL backend cache textures less aggressively #833
Currently we use the WebGL backend for tfjs. Under the hood, every tf.Tensor is represented by a WebGL texture, and textures are acquired & released by an internal TextureManager class. So TextureManager acts similarly to std::allocator.
It turns out that TextureManager leverages a common allocator optimisation: when a tensor is Tensor.dispose()-ed by application code, the GPU memory is not deallocated - the texture is put into a pool of unused (or "free") textures, so the memory is still unavailable to other processes on the user's machine. Then, seemingly, it holds onto those textures indefinitely -- until the whole backend gets killed. When it's time to make a new allocation, it reuses one of the "free" textures - but if and only if the size of the new texture matches the size of the free texture exactly.
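To make the pooling behaviour concrete, here is a minimal sketch of a size-keyed texture pool. This is illustrative only - the names (TexturePool, acquire, release) and the id-based bookkeeping are assumptions for the example, not tfjs's actual TextureManager API:

```typescript
// Hypothetical sketch of the size-keyed pooling described above (not tfjs code).
class TexturePool {
  // Map from texture size key (e.g. "2x48") to a stack of "free" texture ids.
  private free = new Map<string, number[]>();
  private nextId = 0;
  allocatedCount = 0; // total textures ever allocated (memory never shrinks)

  acquire(rows: number, cols: number): number {
    const key = `${rows}x${cols}`;
    const stack = this.free.get(key);
    // Reuse a pooled texture only on an *exact* size match.
    if (stack && stack.length > 0) {
      return stack.pop()!;
    }
    // Otherwise allocate a brand new texture (GPU memory grows).
    this.allocatedCount++;
    return this.nextId++;
  }

  release(id: number, rows: number, cols: number): void {
    // dispose() does not free GPU memory; it just pools the texture.
    const key = `${rows}x${cols}`;
    if (!this.free.has(key)) this.free.set(key, []);
    this.free.get(key)!.push(id);
  }
}

const pool = new TexturePool();
const a = pool.acquire(2, 48);
pool.release(a, 2, 48);
const b = pool.acquire(2, 48); // exact size match: texture `a` is reused
const c = pool.acquire(3, 48); // different size: a fresh allocation
```

Note that releasing `a` and acquiring `b` leaves allocatedCount at 1 (reuse), while the differently-sized `c` bumps it to 2 - that exact-match rule is what the rest of this issue hinges on.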
This is not a big issue for KnnClassifier because it always allocates tensors from a small, consistent set of sizes (I think) - for example, 2x48. But apparently universal-sentence-encoder allocates tensors whose sizes depend on the input phrase, so the size is different most or every time. This allocates tons of textures of various sizes and then never releases the memory back to the OS.
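The contrast between the two workloads can be simulated. The sketch below (again illustrative, not tfjs code) counts fresh allocations under an exact-size-match pool: a fixed-size workload allocates once and then reuses forever, while a workload whose sizes track the input length allocates on every new size:

```typescript
// Simulation of an exact-size-match pool: returns how many fresh textures
// a sequence of allocate-then-dispose operations forces it to create.
function simulate(sizes: number[]): number {
  const free = new Map<number, number>(); // size -> count of pooled textures
  let allocated = 0;
  for (const size of sizes) {
    const pooled = free.get(size) ?? 0;
    if (pooled > 0) {
      free.set(size, pooled - 1); // exact match: reuse a pooled texture
    } else {
      allocated++;                // no match: allocate a new texture
    }
    // Tensor disposed right after use: its texture goes back to the pool.
    free.set(size, (free.get(size) ?? 0) + 1);
  }
  return allocated;
}

// KnnClassifier-like workload: one fixed size, reused every iteration.
const fixedAllocs = simulate(Array(100).fill(96)); // -> 1

// universal-sentence-encoder-like workload: size varies with the phrase.
const variableAllocs = simulate(
  Array.from({ length: 100 }, (_, i) => i + 1)
); // -> 100
```

One hundred iterations cost one allocation in the fixed case and one hundred in the variable case - and since nothing is ever freed, the variable case's GPU footprint just keeps growing.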
As a result, even for a small dev test account I get the following stats after a brief usage session:
A similar issue has been discussed in https://github.com/tensorflow/tfjs/issues/1440, with some workarounds offered. On the surface it looks to me like that's a different usage pattern, but I haven't tried them yet.
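One direction the workarounds in that thread point at is bounding the pool: once the bytes held in free textures exceed some threshold, actually delete textures on dispose instead of pooling them (I believe newer tfjs releases expose a WEBGL_DELETE_TEXTURE_THRESHOLD flag for this, but I haven't verified it against our version). A minimal sketch of that policy - hypothetical names, not tfjs internals:

```typescript
// Sketch of a bounded free pool: above a byte threshold, disposed textures
// are really deallocated instead of being kept for reuse.
class BoundedPool {
  freedBytes = 0;  // bytes actually returned to the GPU / OS
  pooledBytes = 0; // bytes sitting idle in the free pool
  constructor(private thresholdBytes: number) {}

  release(textureBytes: number): void {
    if (this.pooledBytes + textureBytes > this.thresholdBytes) {
      this.freedBytes += textureBytes;  // pool is full: really deallocate
    } else {
      this.pooledBytes += textureBytes; // keep the texture for reuse
    }
  }
}

const bounded = new BoundedPool(1024);
// Dispose ten 256-byte textures: the first four fill the pool,
// the remaining six are genuinely freed.
for (let i = 0; i < 10; i++) bounded.release(256);
```

With a threshold of 0 this degenerates to "free on every dispose", trading reuse speed for a bounded footprint - which may be the right trade for our variable-size universal-sentence-encoder workload.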
So in my case I only need ~30MB of GPU memory, but ~670MB was allocated. See the full snapshot of the TextureManager data here