Stan should accumulate the mean and variance for all unknowns

alashworth / test-issue-import

0 stars 0 forks source link

Stan should accumulate the mean and variance for all unknowns #158

Open alashworth opened 5 years ago

alashworth commented 5 years ago

Issue by bgoodri Tuesday Jul 25, 2017 at 00:29 GMT Originally opened as https://github.com/stan-dev/stan/issues/2361

Summary:

RStan has for a long time calculated the mean for all unknowns, even those that are not returned to R. But this (and the variance) should not be the responsibility of the interfaces.

Description:

At each unthinned iteration after warmup, the (constrained) parameters, transformed parameters, and generated quantities should be run through Welford accumulators to build up the posterior means and variances.

I am not sure how best to pass these means and variances back to the calling function when the sampling is done but we can figure that out later.

Reproducible Steps:

Cannot be done currently

Current Output:

Cannot be done currently

Expected Output:

A std::vector<double> of means and a std::vector<double> of variances.

Additional Information:

None

Current Version:

v2.16.0

alashworth commented 5 years ago

Comment by syclik Tuesday Jul 25, 2017 at 01:24 GMT

@bgoodri, do you envision this as a method on the Stan program class? or something that's reported by the sampler?

Now that I think about it, it seems like it should be reported by the sampler. Thoughts?

alashworth commented 5 years ago

Comment by bgoodri Tuesday Jul 25, 2017 at 03:40 GMT

I think writing it somewhere when the sampler is done would be fine.

On Mon, Jul 24, 2017 at 9:24 PM, Daniel Lee notifications@github.com wrote:

@bgoodri https://github.com/bgoodri, do you envision this as a method on the Stan program class? or something that's reported by the sampler?

Now that I think about it, it seems like it should be reported by the sampler. Thoughts?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/2361#issuecomment-317601082, or mute the thread https://github.com/notifications/unsubscribe-auth/ADOrqrkpfF5eiVvS4yFOWI7m0lUwreOtks5sRUPKgaJpZM4Oh7BW .

alashworth commented 5 years ago

Comment by maverickg Tuesday Jul 25, 2017 at 12:07 GMT

I have an impression that this is in Stan::services now.

On Mon, Jul 24, 2017 at 11:40 PM bgoodri notifications@github.com wrote:

I think writing it somewhere when the sampler is done would be fine.

On Mon, Jul 24, 2017 at 9:24 PM, Daniel Lee notifications@github.com wrote:

@bgoodri https://github.com/bgoodri, do you envision this as a method on the Stan program class? or something that's reported by the sampler?

Now that I think about it, it seems like it should be reported by the sampler. Thoughts?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/2361#issuecomment-317601082, or mute the thread < https://github.com/notifications/unsubscribe-auth/ADOrqrkpfF5eiVvS4yFOWI7m0lUwreOtks5sRUPKgaJpZM4Oh7BW

.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/2361#issuecomment-317619883, or mute the thread https://github.com/notifications/unsubscribe-auth/AAELlFV0awwJXh8wEzoYrE75FO7_GzLZks5sRWOWgaJpZM4Oh7BW .

alashworth commented 5 years ago

Comment by bob-carpenter Tuesday Jul 25, 2017 at 12:31 GMT

I have an impression that this is in Stan::services now.

Not yet. The question I have is why stop at means and variances? We could also compute split R-hat, ESS, and MCMC standard error.

alashworth commented 5 years ago

Comment by betanalpha Tuesday Jul 25, 2017 at 12:56 GMT

Not with the FFT calculation that we currently use.

I agree that all of these summary stats should be offered by the API as C++ implementations and those implementations should definitely use accumulators for stability and accuracy when appropriate, but I don’t agree that we need to be doing this as we stream the samples out.

On Jul 25, 2017, at 8:31 AM, Bob Carpenter notifications@github.com wrote:

I have an impression that this is in Stan::services now.

Not yet. The question I have is why stop at means and variances? We could also compute split R-hat, ESS, and MCMC standard error. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/2361#issuecomment-317722685, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdNlromEAdiZ4m6aGMJPfhuAkOHT4Edks5sReAegaJpZM4Oh7BW.

alashworth commented 5 years ago

Comment by syclik Tuesday Jul 25, 2017 at 14:18 GMT

We could use inversion of control to have the interfaces, at runtime, choose what they want back. It doesn't have to affect the implementation of the algorithms at all.

On Jul 25, 2017, at 8:56 AM, Michael Betancourt notifications@github.com wrote:

Not with the FFT calculation that we currently use.

I agree that all of these summary stats should be offered by the API as C++ implementations and those implementations should definitely use accumulators for stability and accuracy when appropriate, but I don’t agree that we need to be doing this as we stream the samples out.

On Jul 25, 2017, at 8:31 AM, Bob Carpenter notifications@github.com wrote:

I have an impression that this is in Stan::services now.

Not yet. The question I have is why stop at means and variances? We could also compute split R-hat, ESS, and MCMC standard error. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/2361#issuecomment-317722685, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdNlromEAdiZ4m6aGMJPfhuAkOHT4Edks5sReAegaJpZM4Oh7BW.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

alashworth commented 5 years ago

Comment by syclik Tuesday Jul 25, 2017 at 14:21 GMT

(This doesn't solve rhat due to fit, but no reason we can't do it for means and variances.)

On Jul 25, 2017, at 10:17 AM, Daniel Lee bearlee@alum.mit.edu wrote:

We could use inversion of control to have the interfaces, at runtime, choose what they want back. It doesn't have to affect the implementation of the algorithms at all.

On Jul 25, 2017, at 8:56 AM, Michael Betancourt notifications@github.com wrote:

Not with the FFT calculation that we currently use.

I agree that all of these summary stats should be offered by the API as C++ implementations and those implementations should definitely use accumulators for stability and accuracy when appropriate, but I don’t agree that we need to be doing this as we stream the samples out.

On Jul 25, 2017, at 8:31 AM, Bob Carpenter notifications@github.com wrote:

I have an impression that this is in Stan::services now.

Not yet. The question I have is why stop at means and variances? We could also compute split R-hat, ESS, and MCMC standard error. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/2361#issuecomment-317722685, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdNlromEAdiZ4m6aGMJPfhuAkOHT4Edks5sReAegaJpZM4Oh7BW.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

alashworth commented 5 years ago

Comment by bob-carpenter Tuesday Jul 25, 2017 at 14:38 GMT

@syclik What do you mean by inversion of control in this case? More callbacks passed in to handle accumulation of means and variances or something like that?

And yes, we can't be doing r-hat online---that's way too expensive. And I don't see any reason to return the accumulated means and variances as the program runs. There'll also be the issue that we don't even want to start collecting these things until after warmup.

alashworth commented 5 years ago

Comment by syclik Tuesday Jul 25, 2017 at 17:39 GMT

Exactly. But now I realize we don't even need to expose additional callbacks. We could use a visitor or chain of responsibilities pattern.

As I'm working on the better c++ interface for the interfaces to use, I'm making sure I keep track of iteration number and whether it's warm-up or not.

On Jul 25, 2017, at 10:38 AM, Bob Carpenter notifications@github.com wrote:

@syclik What do you mean by inversion of control in this case? More callbacks passed in to handle accumulation of means and variances or something like that?

And yes, we can't be doing r-hat online---that's way too expensive. And I don't see any reason to return the accumulated means and variances as the program runs. There'll also be the issue that we don't even want to start collecting these things until after warmup.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

alashworth commented 5 years ago

Comment by bgoodri Tuesday Jul 25, 2017 at 21:49 GMT

If the caller is getting the draws of everything, then it doesn't matter very much if the means and variances (or whatever else) are being accumulated because it is a cheap calculation to do them. But if the caller is excluding draws for some unknown (which is long since possible in RStan, maybe PyStan, and should be implemented for CmdStan), then the means and variances cannot be calculated after the fact. So, to provide them, we would have to accumulate them, which is also a cheap calculation. I suppose the mean is the minimal thing we could provide but including the variance seems only slightly less minimal.

On Tue, Jul 25, 2017 at 1:40 PM, Daniel Lee notifications@github.com wrote:

Exactly. But now I realize we don't even need to expose additional callbacks. We could use a visitor or chain of responsibilities pattern.

As I'm working on the better c++ interface for the interfaces to use, I'm making sure I keep track of iteration number and whether it's warm-up or not.

On Jul 25, 2017, at 10:38 AM, Bob Carpenter notifications@github.com wrote:

@syclik What do you mean by inversion of control in this case? More callbacks passed in to handle accumulation of means and variances or something like that?

And yes, we can't be doing r-hat online---that's way too expensive. And I don't see any reason to return the accumulated means and variances as the program runs. There'll also be the issue that we don't even want to start collecting these things until after warmup.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/2361#issuecomment-317813067, or mute the thread https://github.com/notifications/unsubscribe-auth/ADOrqlRGG2uer7n-qL83zEHNiuYEiQiXks5sRiiSgaJpZM4Oh7BW .

alashworth commented 5 years ago

Comment by sakrejda Tuesday Jul 25, 2017 at 22:22 GMT

We could just provide a syntax so that you specify something like 'mean(theta)' when you specify which parameters to save and then a separate writer callback does the accumulating.

On Tue, Jul 25, 2017, 5:49 PM bgoodri notifications@github.com wrote:

If the caller is getting the draws of everything, then it doesn't matter very much if the means and variances (or whatever else) are being accumulated because it is a cheap calculation to do them. But if the caller is excluding draws for some unknown (which is long since possible in RStan, maybe PyStan, and should be implemented for CmdStan), then the means and variances cannot be calculated after the fact. So, to provide them, we would have to accumulate them, which is also a cheap calculation. I suppose the mean is the minimal thing we could provide but including the variance seems only slightly less minimal.

On Tue, Jul 25, 2017 at 1:40 PM, Daniel Lee notifications@github.com wrote:

Exactly. But now I realize we don't even need to expose additional callbacks. We could use a visitor or chain of responsibilities pattern.

As I'm working on the better c++ interface for the interfaces to use, I'm making sure I keep track of iteration number and whether it's warm-up or not.

On Jul 25, 2017, at 10:38 AM, Bob Carpenter notifications@github.com wrote:

@syclik What do you mean by inversion of control in this case? More callbacks passed in to handle accumulation of means and variances or something like that?

And yes, we can't be doing r-hat online---that's way too expensive. And I don't see any reason to return the accumulated means and variances as the program runs. There'll also be the issue that we don't even want to start collecting these things until after warmup.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/2361#issuecomment-317813067, or mute the thread < https://github.com/notifications/unsubscribe-auth/ADOrqlRGG2uer7n-qL83zEHNiuYEiQiXks5sRiiSgaJpZM4Oh7BW

.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/2361#issuecomment-317882985, or mute the thread https://github.com/notifications/unsubscribe-auth/AAfA6eO7Sz5E1uCfrKbTGSqT5e__m5Fxks5sRmLXgaJpZM4Oh7BW .

alashworth commented 5 years ago

Comment by bob-carpenter Wednesday Jul 26, 2017 at 09:42 GMT

@bgoodri: Are you suggesting that we calculate the means and variances before thinning?

I realize that would provide higher ESS in general, but I think it'll be very misleading in the context of thinning where analyzing the thinned posterior that a user actually saved would provide a different answer than what we report.

alashworth commented 5 years ago

Comment by bgoodri Wednesday Jul 26, 2017 at 14:15 GMT

No, after thinning, which is to say only for the post-warmup iterations that are written out. For parameters whose draws are retained, the accumulated means and variances should match what the user would get if they calculated the means and variances themselves.

On Wed, Jul 26, 2017 at 5:42 AM, Bob Carpenter notifications@github.com wrote:

@bgoodri https://github.com/bgoodri: Are you suggesting that we calculate the means and variances before thinning?

I realize that would provide higher ESS in general, but I think it'll be very misleading in the context of thinning where analyzing the thinned posterior that a user actually saved would provide a different answer than what we report.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/2361#issuecomment-318004026, or mute the thread https://github.com/notifications/unsubscribe-auth/ADOrqrpxvq8ubEnN9y7-1M9Qh8wxIYDrks5sRwnwgaJpZM4Oh7BW .