stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License

report condition number of metric for HMC adaptive samplers #3280

Open mitzimorris opened 2 months ago

mitzimorris commented 2 months ago

Summary:

Extend the functionality added in https://github.com/stan-dev/stan/pull/3230 to return the Hessian and the condition number of the Hessian at the end of adaptation.

Description:

The Stan User's Guide chapter on efficiency tuning describes what the condition number is, but this is not exposed in the Stan services layer.

Now that it is possible to output the metric in a JSON file, we should be able to provide the functionality available in BridgeStan (https://github.com/roualdes/bridgestan/blob/36d391a6fc8ac8a8ddc901a66df56e9e1b73fb13/src/bridgestan.cpp#L184-L206): get the Hessian, compute its eigenvalues, and report the ratio of the largest to the smallest.
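To make the proposed quantity concrete, here is a minimal sketch of the eigenvalue ratio for a symmetric matrix such as a Hessian or a dense metric. It is not the BridgeStan or Stan implementation: the function names `symmetric_eigenvalues` and `condition_number` are hypothetical, and a plain cyclic Jacobi iteration stands in for whatever eigensolver the services layer would actually use.

```python
import math


def symmetric_eigenvalues(a, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations (illustrative only)."""
    n = len(a)
    m = [[float(v) for v in row] for row in a]  # work on a copy
    for _ in range(max_sweeps):
        # Frobenius norm of the off-diagonal part; stop once it is negligible.
        off = math.sqrt(sum(m[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                b = m[p][q]
                if b == 0.0:
                    continue
                # Angle with |theta| <= pi/4 satisfying tan(2*theta) = 2b / (m_qq - m_pp),
                # which zeroes the (p, q) entry after the rotation.
                diff = m[q][q] - m[p][p]
                sign = math.copysign(1.0, diff)
                theta = 0.5 * math.atan2(2.0 * b * sign, abs(diff))
                c, s = math.cos(theta), math.sin(theta)
                # Right-multiply by the rotation (updates columns p and q).
                for k in range(n):
                    mkp, mkq = m[k][p], m[k][q]
                    m[k][p] = c * mkp - s * mkq
                    m[k][q] = s * mkp + c * mkq
                # Left-multiply by its transpose (updates rows p and q).
                for k in range(n):
                    mpk, mqk = m[p][k], m[q][k]
                    m[p][k] = c * mpk - s * mqk
                    m[q][k] = s * mpk + c * mqk
    return sorted(m[i][i] for i in range(n))


def condition_number(h):
    """Ratio of the largest to the smallest eigenvalue magnitude of a symmetric matrix."""
    mags = [abs(ev) for ev in symmetric_eigenvalues(h)]
    smallest = min(mags)
    return math.inf if smallest == 0.0 else max(mags) / smallest


if __name__ == "__main__":
    print(condition_number([[2.0, 1.0], [1.0, 2.0]]))
```

For a diagonal metric the eigenvalues are just the diagonal entries, so the same ratio reduces to the largest entry divided by the smallest.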

We should do this for Optimization, ADVI, and Pathfinder as well. Pathfinder's diagnostics files would allow this; the other algorithms would need additional output writers.

Expected Output:

Additional fields added to output file metric.json.

Current Version:

v2.34.1

mitzimorris commented 2 months ago

Alternatively, we could write a stand-alone utility for CmdStan that, given a sample, computes the condition number, like diagnose or stansummary.

bob-carpenter commented 2 months ago

This would be a nice diagnostic.

How are you proposing to compute Hessians? We can only get autodiff Hessians for our analytical functions; the implicit functions aren't implemented with forward-mode autodiff. We can get finite-difference Hessians everywhere, but that's computationally expensive and only gives about half the accuracy (which could be a problem when computing the condition number).
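For readers unfamiliar with the finite-difference route, the sketch below shows one common variant: central differences of an available gradient, followed by symmetrization. It is an illustration under stated assumptions, not Stan Math's implementation; `finite_diff_hessian`, the step-size rule, and the example gradient are all hypothetical. The cost is two gradient evaluations per dimension, and the step size trades truncation error against floating-point cancellation, which is where the loss of accuracy comes from.

```python
import math


def finite_diff_hessian(grad, x, eps=1e-6):
    """Approximate a Hessian by central differences of a gradient function.

    grad: callable mapping a list of floats to the gradient (list of floats).
    x:    point at which to approximate the Hessian.
    eps:  relative step size (an assumed default, not a tuned value).
    """
    n = len(x)
    h = [[0.0] * n for _ in range(n)]
    for j in range(n):
        step = eps * max(1.0, abs(x[j]))
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += step
        x_minus[j] -= step
        g_plus, g_minus = grad(x_plus), grad(x_minus)
        # Column j holds the derivative of every gradient component w.r.t. x[j].
        for i in range(n):
            h[i][j] = (g_plus[i] - g_minus[i]) / (2.0 * step)
    # Average with the transpose so rounding noise cannot break symmetry.
    return [[0.5 * (h[i][j] + h[j][i]) for j in range(n)] for i in range(n)]


def example_grad(v):
    """Gradient of the toy function f(x, y) = x^2 * y + exp(y)."""
    x, y = v
    return [2.0 * x * y, x * x + math.exp(y)]


if __name__ == "__main__":
    # The exact Hessian at (1, 0.5) is [[2y, 2x], [2x, exp(y)]].
    for row in finite_diff_hessian(example_grad, [1.0, 0.5]):
        print(row)
```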

WardBrian commented 2 months ago

@andrjohns implemented a nice way of doing finite differences only for the functions which don't have full forward mode support: https://github.com/stan-dev/math/pull/2929

This is essentially a "best of both worlds" approach. It looks like only integrate_1d was actually added to the framework, but I see no reason why the remaining implicit functions couldn't all be added as well.

bob-carpenter commented 2 months ago

That PR from @andrjohns is really amazing. That's a great way to handle these cases. I agree that the other ones should be added.