stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License

return variational parameters at solution #1786

bob-carpenter closed this issue 8 years ago

bob-carpenter commented 8 years ago

Summary:

Right now, there's no way to get the posterior mean fit by ADVI other than by averaging the samples.

Description:

Include a method to return the posterior mean as fit by ADVI.

Current Version:

v2.9.0

dustinvtran commented 8 years ago

Relevant stan-users thread.

Oops, sorry, I should have been clearer when responding. The mean parameters of ADVI's Gaussian approximation are written to the CSV file, appropriately transformed to the constrained space (see here). They're written on the first line of the CSV after all the meta comments. However, RStan skips this line in its own output, which is what John asked about.

I'd like to keep this issue open though because I think ADVI should also output its standard deviation parameters. Most generally, it should output its variational parameters whenever we get around to implementing more expressive/non-Gaussian variational families.
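
As a rough illustration of the CSV layout described above, the following Python sketch separates the mean row from the approximate draws. It assumes that comment lines start with `#`, that the first non-comment line is the column header, and that the next line holds the means; the function name `read_advi_csv` and the example contents are made up for illustration and are not real Stan output.

```python
import csv

def read_advi_csv(text):
    """Split Stan-style CSV text into (header, mean_row, draws).

    Assumed layout (not verified against every Stan version):
    '#' comment lines, then a header line, then one row of means,
    then one row per approximate posterior draw.
    """
    # Drop blank lines and comment lines wherever they appear.
    data_lines = [
        ln for ln in text.splitlines()
        if ln.strip() and not ln.startswith("#")
    ]
    rows = list(csv.reader(data_lines))
    header = rows[0]
    values = [[float(x) for x in row] for row in rows[1:]]
    mean_row = dict(zip(header, values[0]))          # first data row: the means
    draws = [dict(zip(header, v)) for v in values[1:]]  # remaining rows: draws
    return header, mean_row, draws

# Hypothetical file contents, for illustration only.
example = """# method = variational
# algorithm = meanfield
lp__,mu,sigma
0,1.5,0.8
0,1.4,0.9
0,1.7,0.7
"""

header, mean_row, draws = read_advi_csv(example)
print(mean_row["mu"], len(draws))
```

An interface that treats every data row as a draw would silently mix the mean row into the sample, which is one reason to handle that first row separately.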

bob-carpenter commented 8 years ago

Won't that confuse users who try to use the CSV file to do MCMC?

dustinvtran commented 8 years ago

Hence your comment last year https://github.com/stan-dev/stan/issues/1661

akucukelbir commented 8 years ago

@dustinvtran do you want ADVI to output "its standard deviation parameters" in the unconstrained space, or in the constrained space?

the idea of drawing samples from the ADVI approximation to the posterior was to make all downstream processing (like reporting means, quantiles, etc.) similar to MCMC.

dustinvtran commented 8 years ago

Generally, I think all variational parameters should be outputted to the CSV (as a comment) and analogously, all variational parameters should be able to be specified for initialization. This is so that we can re-initialize the variational distribution using a previous VI run and not only the mean as we do now. This extends generally to scenarios where the variational family is not parameterized explicitly by moments (e.g. mean/variance).

For this to happen, I think either the unconstrained or constrained space is fine—whichever is the convention for initializing, e.g., L-BFGS, so that we don’t need to apply an additional function for mapping to the right space.

Dustin

bob-carpenter commented 8 years ago

Internally, everything for parameters is done on the unconstrained space. So when users provide constrained inits in the MCMC and optimization commands, they are transformed to unconstrained. I'd say reporting the unconstrained parameters is best --- the compiled Stan program class provides the means to translate back and forth and this is exposed at least in RStan.

Data, on the other hand, always lives on its own scale.
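
To make the constrained/unconstrained distinction concrete, here is a minimal Python sketch of two such mappings: a log transform for a lower-bounded scalar and a logit transform for an interval-bounded scalar. The function names are hypothetical and this is not the code the compiled Stan program class exposes; it only illustrates the kind of translation that happens when constrained inits are supplied.

```python
import math

def unconstrain_lower(x, lower=0.0):
    """Map x > lower to the whole real line with a log transform."""
    return math.log(x - lower)

def constrain_lower(y, lower=0.0):
    """Inverse of unconstrain_lower."""
    return lower + math.exp(y)

def unconstrain_interval(x, lower, upper):
    """Map lower < x < upper to the real line with a logit transform."""
    u = (x - lower) / (upper - lower)
    return math.log(u / (1.0 - u))

def constrain_interval(y, lower, upper):
    """Inverse of unconstrain_interval (scaled inverse logit)."""
    return lower + (upper - lower) / (1.0 + math.exp(-y))

# A constrained init for a scale parameter, moved to the unconstrained
# space and back again.
init_sigma = 2.5
y = unconstrain_lower(init_sigma)
print(y, constrain_lower(y))
```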

davharris commented 6 years ago

Has this issue been solved completely, or just partially? Previously, @dustinvtran pointed out that there was no way to get the variational parameters for standard deviations, and that it would be good if these could be made available (below). If that's still true, could this issue be re-opened? (Alternatively, I could file a new issue, if that's easier).

Thanks, and sorry if I'm just missing something.

I'd like to keep this issue open though because I think ADVI should also output its standard deviation parameters. Most generally, it should output its variational parameters whenever we get around to implementing more expressive/non-Gaussian variational families.

bob-carpenter commented 6 years ago

The posterior draws take the variational approximation on the unconstrained scale, sample from it, then transform the draws back to the constrained space to produce a constrained sample. At that point, you can use the constrained sample to estimate the posterior std deviation in the constrained space if that's what you want to do.

If only intervals are required, that can be done by transforming their boundaries. But I'm not sure how you'd calculate std deviation in the constrained space given only std deviation in the unconstrained space. It's easy for things like the log, but I don't know how to do that for simplexes or correlation matrices.
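
The procedure above can be sketched in Python for the easy case of a single positive parameter with a log transform. The values of `mu` and `sigma` are hypothetical variational parameters on the unconstrained (log) scale, chosen only for illustration. The sketch draws from the Gaussian, maps the draws back with `exp`, estimates the constrained standard deviation from the draws, and compares it with the closed form for a lognormal. It also obtains an interval by transforming the boundaries, which works because `exp` is monotone.

```python
import math
import random
import statistics

# Hypothetical variational mean and sd on the unconstrained (log) scale.
mu, sigma = 0.3, 0.5

# Sample on the unconstrained scale, then transform each draw back.
rng = random.Random(0)
unconstrained = [rng.gauss(mu, sigma) for _ in range(20000)]
constrained = [math.exp(z) for z in unconstrained]

# Monte Carlo estimate of the sd in the constrained space.
mc_sd = statistics.stdev(constrained)

# Closed form for the log case: sd of a lognormal(mu, sigma).
exact_sd = math.sqrt((math.exp(sigma**2) - 1.0) * math.exp(2.0 * mu + sigma**2))

# Central 90% interval: transform the unconstrained boundaries directly.
z90 = 1.6448536269514722  # 95th percentile of the standard normal
interval = (math.exp(mu - z90 * sigma), math.exp(mu + z90 * sigma))

print(mc_sd, exact_sd, interval)
```

No comparable closed form is offered here for simplexes or correlation matrices; for those, estimating from the transformed draws remains the fallback.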

davharris commented 6 years ago

Thanks for your quick response. I don't think I was clear. The parameters I'm interested in happen to be unconstrained, so I think that my question is scale-independent.

My understanding is that Stan optimizes the means and standard deviations, then samples from the resulting distributions, then throws away the standard deviations. I'm asking if Stan could save those values instead, like it does with the means.

Edited to add: In case this wasn't clear, I'm asking because finite-size Monte Carlo estimates of the standard deviations are slower and less accurate. Also, if the standard deviations were retained, then it would be very easy to collect additional samples from a fitted model.
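
A small Python sketch of what retaining the standard deviations would allow, assuming a mean-field Gaussian approximation over unconstrained parameters. The dictionary `fit`, its parameter names, and its values are hypothetical. With the means and standard deviations stored, any number of further draws can be generated later, and the exact standard deviation is available without the error of a finite-sample estimate.

```python
import random
import statistics

# Hypothetical saved variational parameters: name -> (mean, sd),
# for parameters that are already unconstrained.
fit = {"alpha": (1.2, 0.4), "beta": (-0.7, 0.15)}

def draw(fit, n, seed=None):
    """Draw n independent samples from the mean-field Gaussian in `fit`."""
    rng = random.Random(seed)
    return [
        {name: rng.gauss(m, s) for name, (m, s) in fit.items()}
        for _ in range(n)
    ]

# With only 100 draws, the estimated sd is a noisy version of the stored one.
small = draw(fit, 100, seed=1)
est_sd = statistics.stdev([d["alpha"] for d in small])
print("stored sd:", fit["alpha"][1], "estimated from 100 draws:", est_sd)

# Additional draws can be collected at any time without refitting.
more = draw(fit, 5, seed=2)
```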

bob-carpenter commented 6 years ago

Yes, the algorithm could return unconstrained standard deviations.

Nothing in any of our interfaces returns unconstrained parameters at the moment, but that doesn't mean we can't add it.

davharris commented 6 years ago

Okay, that makes sense. It sounds like it might be a bigger change than I’d realized. Thanks for your perspective on this.
