tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

Memory leak when drawing many tfd.GaussianProcess samples? #1478

Open tom-andersson opened 2 years ago

tom-andersson commented 2 years ago

When instantiating many 2-dimensional GPs with exponentiated quadratic kernels and sampling each at several thousand index points, I eventually get a memory error: ResourceExhaustedError: failed to allocate memory [Op:Mul]

I am able to produce a minimal working example in Google Colab: https://colab.research.google.com/drive/1yOzrWbyyia3zLirXQ6Z76scf20iryP4W#scrollTo=wNzD5JdxqgOX

Just make sure you set Runtime -> Change runtime type -> Hardware accelerator = GPU so that a GPU is used.

I provide the code here as well, for convenience:

import numpy as np
import tensorflow_probability as tfp
tfd = tfp.distributions
tfk = tfp.math.psd_kernels
from tqdm import tqdm

# This raises ResourceExhaustedError after 626 iterations; note that a fresh
# GaussianProcess object is built on every iteration
for i in tqdm(range(1000)):
    foo = tfd.GaussianProcess(
        kernel=tfk.ExponentiatedQuadratic(np.float64(1.), np.float64(1.)),
        index_points=np.random.randn(6_500, 2).astype(np.float64),
        observation_noise_variance=.05**2,
    ).sample(seed=i).numpy()

I thought I could avoid this error by reusing the same GaussianProcess object and only drawing samples inside the loop, but that also ended up raising a ResourceExhaustedError (see the sketch below). This suggests the issue comes from running sample() many times rather than from instantiating the GP objects. Does this hint at a memory leak?
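For reference, the reused-object variant looks roughly like this (same kernel and index points as above):

gp = tfd.GaussianProcess(
    kernel=tfk.ExponentiatedQuadratic(np.float64(1.), np.float64(1.)),
    index_points=np.random.randn(6_500, 2).astype(np.float64),
    observation_noise_variance=.05**2,
)

# Only sample() runs inside the loop, yet memory still grows until
# a ResourceExhaustedError is raised.
for i in tqdm(range(1000)):
    foo = gp.sample(seed=i).numpy()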

srvasude commented 2 years ago

Hi, a couple of comments here:

Is the intent to draw 1000 samples from a GP parameterized by an ExponentiatedQuadratic(1., 1.) kernel?

If so, I would create the GP object outside the loop. The GP object gets recreated on every iteration, which means the covariance matrix has to be recomputed each time, and that adds to memory costs. Making one GP outside the loop and calling gp.sample(seed=i) should suffice.

Note that you can also eliminate the loop if you just do gp.sample(1000, seed=23).

You will get back a Tensor of shape [1000, 6500]. Indexing into the first dimension will give you back independent samples. This should drastically reduce memory and also make things much faster, since the samples will be generated in a vectorized fashion (see the sketch below).
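A minimal sketch of that vectorized approach, reusing the kernel and index points from the original snippet:

import numpy as np
import tensorflow_probability as tfp

tfd = tfp.distributions
tfk = tfp.math.psd_kernels

# Build the GP once, outside any loop.
gp = tfd.GaussianProcess(
    kernel=tfk.ExponentiatedQuadratic(np.float64(1.), np.float64(1.)),
    index_points=np.random.randn(6_500, 2).astype(np.float64),
    observation_noise_variance=.05**2,
)

# Draw all 1000 samples in a single vectorized call.
samples = gp.sample(1000, seed=23).numpy()  # shape [1000, 6500]

# Each row along the first dimension is an independent draw.
first_draw = samples[0]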

tom-andersson commented 2 years ago

Hi @srvasude, thanks very much for the response. You might have missed that in my OP I said I also got the error when reusing the same GaussianProcess object and only drawing samples in the loop.

However, passing 1000 to sample() did the trick for me! I've updated the Google Colab MWE to demonstrate your solution, and I have kept the code block with the loop that triggers the ResourceExhaustedError.

I'll leave this issue open because I still think it would be worth investigating why memory grows when sample() is run many times. But if a TensorFlow team member disagrees, feel free to close.

ikarosilva commented 2 years ago

@srvasude

"The GP object gets recreated each time in the loop, which means that the covariance matrix has to be recomputed each time which adds to memory costs. Making one GP outside the loop and calling gp.sample(seed=i) should suffice."

Just curious: why can't the previously computed GP objects be reclaimed by the garbage collector, since each one goes out of scope (it is overwritten by the current "foo" instance)? Could this be improved by updating the class destructor somehow? Thank you.
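One way to check this hypothesis (a sketch based on the original snippet, with an explicit collection step added) is to drop the reference and force garbage collection on every iteration, then see whether memory still grows:

import gc

for i in tqdm(range(1000)):
    foo = tfd.GaussianProcess(
        kernel=tfk.ExponentiatedQuadratic(np.float64(1.), np.float64(1.)),
        index_points=np.random.randn(6_500, 2).astype(np.float64),
        observation_noise_variance=.05**2,
    ).sample(seed=i).numpy()
    # Explicitly drop the only Python reference and force a collection.
    del foo
    gc.collect()
    # If memory still grows after this, the allocations are presumably being
    # held outside ordinary Python object lifetimes (e.g. by the TensorFlow
    # runtime), so a class destructor alone would not help.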