tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

mcmc samples fill up memory #436

Open rzu512 opened 5 years ago

rzu512 commented 5 years ago

I run Hamiltonian Monte Carlo on 4 copies of my model for 10^5 steps on a GPU. Each copy of the model contains about 1000 parameters, and the log-likelihood function contains tf.scan. The main (CPU) memory fills up quickly.

Can I just get the value of the parameters that give the largest log-likelihood instead of the whole trace of samples?

cc20002002 commented 5 years ago

I have the same problem. I am fitting a Bayesian changepoint model, and the samples fill up my 32 GB of memory fairly quickly.

TahaSaleh01 commented 4 years ago

Hi, I am facing the same problem. Is this issue resolved in the new TF version?

brianwa84 commented 4 years ago

You can use thinning to get fewer results out: pass num_steps_between_results to sample_chain. Picking the single most likely sample kind of defeats the purpose of MCMC, which is to collect a set of "typical" samples.
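
A rough sketch of thinning for a problem of this shape (the target here is a toy standard normal; substitute your own log-prob function and step sizes):

import tensorflow as tf
import tensorflow_probability as tfp

# Toy target: standard normal. Substitute your own model's log-prob.
kernel = tfp.mcmc.HamiltonianMonteCarlo(
    target_log_prob_fn=lambda x: -0.5 * tf.reduce_sum(x**2, axis=-1),
    step_size=0.1,
    num_leapfrog_steps=3)

# Take ~10^5 HMC steps but materialize only 1000 states: every 101st
# state is kept, so the stored trace is ~100x smaller.
samples = tfp.mcmc.sample_chain(
    num_results=1000,
    num_steps_between_results=100,
    current_state=tf.zeros([4, 1000]),  # 4 chains, ~1000 parameters each
    kernel=kernel,
    trace_fn=None)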

jcalifornia commented 3 years ago

Is there a way to offload the old samples to system memory, keeping GPU memory free, while running the chain?

brianwa84 commented 3 years ago

There is the notion of a Reducer in tfp.experimental.mcmc (used with the WithReductions transition kernel). You could write a py_function that writes out, say, 10 underlying transition-kernel samples to disk, then returns only the last one.
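
For illustration, a minimal sketch of such a reducer, assuming the tfp.experimental.mcmc.Reducer interface (initialize / one_step / finalize). This module is experimental, so the exact signatures may have shifted; write_sample_to_disk and hmc_kernel are hypothetical placeholders you would supply yourself:

import tensorflow as tf
import tensorflow_probability as tfp

class DiskOffloadReducer(tfp.experimental.mcmc.Reducer):
  """Streams each sample out via py_function; keeps only the latest."""

  def initialize(self, initial_chain_state, initial_kernel_results=None):
    # Nothing to accumulate on-device; the reducer state is just
    # the most recent chain state.
    return initial_chain_state

  def one_step(self, new_chain_state, current_reducer_state,
               previous_kernel_results=None):
    # py_function escapes the graph so ordinary Python I/O can run;
    # write_sample_to_disk is a hypothetical user-supplied callback
    # (e.g. appending to an HDF5 file).
    tf.py_function(write_sample_to_disk, inp=[new_chain_state], Tout=[])
    return new_chain_state

  def finalize(self, final_reducer_state):
    return final_reducer_state

kernel = tfp.experimental.mcmc.WithReductions(
    inner_kernel=hmc_kernel,  # your base transition kernel
    reducer=DiskOffloadReducer())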

chrism0dwk commented 3 years ago

As @brianwa84 suggests, we tend to just chunk our MCMC, firing off a burst of, say, 1000 samples at a time and dumping it to an HDF5 file. Something like:

num_burst_samples = 1000
num_bursts = 50
final_results = None  # kernel results carried over between bursts
for i in range(num_bursts):
    # current_state, kernel, and hdf5_array are assumed to be set up already.
    samples, results, final_results = tfp.mcmc.sample_chain(
        num_burst_samples,
        current_state=current_state,
        kernel=kernel,
        previous_kernel_results=final_results,
        return_final_kernel_results=True)
    # Dump this burst to disk and restart the chain from its last state.
    hdf5_array[(i * num_burst_samples):((i + 1) * num_burst_samples)] = samples
    current_state = [s[-1] for s in samples]

which works okay, since the cost of the write typically outweighs the cost of a kernel startup. A better approach is to wrap the call to tfp.mcmc.sample_chain in a function decorated with @tf.function, with or without experimental_compile=True. The key is to return the final_kernel_results so you can restart the chain for the next chunk from where it left off.
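
A rough sketch of that wrapping, reusing kernel and num_burst_samples from the snippet above (run_burst is just an illustrative name):

@tf.function(experimental_compile=True)
def run_burst(current_state, previous_kernel_results):
  # Traced and compiled once; later bursts reuse the compiled graph.
  return tfp.mcmc.sample_chain(
      num_burst_samples,
      current_state=current_state,
      kernel=kernel,
      previous_kernel_results=previous_kernel_results,
      return_final_kernel_results=True)

Note that passing previous_kernel_results=None on the first burst and real kernel results afterwards will trigger one extra trace of the function, which is a one-time cost.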

Chris