Open · tom-andersson opened this issue 2 years ago
srvasude commented:
Hi, a couple of comments here:
Is the intent to draw 1000 samples from a GP parameterized by an ExponentiatedQuadratic(1., 1.) kernel?
If so, I would create the GP object outside the loop. The GP object gets recreated each time in the loop, which means that the covariance matrix has to be recomputed each time, which adds to memory costs. Making one GP outside the loop and calling gp.sample(seed=i) should suffice.
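A minimal sketch of this suggestion (the 1-D grid of 6500 index points and the float32 dtype are assumptions standing in for the Colab's actual inputs):

```python
import numpy as np
import tensorflow_probability as tfp

tfd = tfp.distributions
tfpk = tfp.math.psd_kernels

# Assumed index points; the thread mentions ~6500 of them.
index_points = np.linspace(-1., 1., 6500)[:, np.newaxis].astype(np.float32)

kernel = tfpk.ExponentiatedQuadratic(amplitude=1., length_scale=1.)

# Build the GP once, outside the loop, instead of once per iteration.
gp = tfd.GaussianProcess(kernel=kernel, index_points=index_points)

# Draw one sample per iteration, with a distinct seed each time.
samples = [gp.sample(seed=i) for i in range(1000)]
```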
Note that you can also eliminate the loop entirely by calling gp.sample(1000, seed=23).
You will get back a Tensor of shape [1000, 6500]. Indexing into the first dimension gives you independent samples. This should drastically reduce memory use and also make things much faster, since the samples are generated in a vectorized fashion.
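Continuing the same sketch, with gp as built above:

```python
# One vectorized call replaces the 1000-iteration loop; `gp` is the
# GaussianProcess built in the previous sketch.
samples = gp.sample(1000, seed=23)  # Tensor of shape [1000, 6500]

# Each row along the first dimension is an independent draw.
first_draw = samples[0]
```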
Hi @srvasude, thanks very much for the response. You might have missed that in my OP I said I also had the error when reusing the same GaussianProcess object and only drawing samples in the loop.
However, using the method of passing 1000 to sample did the trick for me! I've updated the Google Colab MWE to demonstrate your solution. I have also kept the code block with the loop that triggers the ResourceExhaustedError.
I'll leave this issue open because I still think it would be worth tracking down the source of the growing memory cost of calling sample multiple times. But if a TensorFlow team member disagrees, feel free to close.
@srvasude
"The GP object gets recreated each time in the loop, which means that the covariance matrix has to be recomputed each time which adds to memory costs. Making one GP outside the loop and calling gp.sample(seed=i) should suffice."
Just curious: why can't the previously computed GP objects be reclaimed by the garbage collector, since each one goes out of scope (it was overwritten by the current "foo" instance)? Could this be improved by updating the class destructor somehow? Thank you.
tom-andersson's original report:
When instantiating many 2-dimensional GPs with exponentiated quadratic kernels and sampling over several thousand points, I'm getting memory errors:
ResourceExhaustedError: failed to allocate memory [Op:Mul]
I am able to produce a minimal working example in Google Colab: https://colab.research.google.com/drive/1yOzrWbyyia3zLirXQ6Z76scf20iryP4W#scrollTo=wNzD5JdxqgOX
Just ensure you have Runtime -> Change runtime type -> Hardware accelerator = GPU so that GPUs are used.
I provide the code here as well, for convenience:
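The code block itself did not come through here; what follows is a rough sketch of the looping pattern described above, reusing the "foo" name quoted later in the thread. The index-point grid is an assumption, and the report describes 2-dimensional index points where this sketch uses a 1-D grid for brevity:

```python
import numpy as np
import tensorflow_probability as tfp

tfd = tfp.distributions
tfpk = tfp.math.psd_kernels

# Assumed grid of several thousand points (shapes elsewhere in the
# thread suggest 6500).
index_points = np.linspace(-1., 1., 6500)[:, np.newaxis].astype(np.float32)

samples = []
for i in range(1000):
    # A fresh GP (and hence a fresh covariance computation) on every
    # iteration -- the pattern that triggers the ResourceExhaustedError.
    foo = tfd.GaussianProcess(
        kernel=tfpk.ExponentiatedQuadratic(amplitude=1., length_scale=1.),
        index_points=index_points)
    samples.append(foo.sample(seed=i))
```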
I thought I could avoid this error by reusing the same GaussianProcess object when drawing samples, but this also ended up raising a ResourceExhaustedError. I guess this suggests the issue comes from running sample() many times rather than from instantiating the GP objects. Does this hint at a memory leak?
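For concreteness, a sketch of that reuse variant, under the same assumptions as the previous snippet:

```python
# One GP object, created a single time...
foo = tfd.GaussianProcess(
    kernel=tfpk.ExponentiatedQuadratic(amplitude=1., length_scale=1.),
    index_points=index_points)

# ...but sample() is still called 1000 times in a Python loop, which,
# per the report above, also ends in a ResourceExhaustedError.
samples = [foo.sample(seed=i) for i in range(1000)]
```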