Add `diagnose` method to `CmdStanModel`.

stan-dev / cmdstanpy

CmdStanPy is a lightweight interface to Stan for Python users which provides the necessary objects and functions to compile a Stan program and fit the model to data using CmdStan.

BSD 3-Clause "New" or "Revised" License

149 stars, 67 forks

Add `diagnose` method to `CmdStanModel`. #734

Open tillahoffmann opened 5 months ago

tillahoffmann commented 5 months ago

Submission Checklist

- [x] Run unit tests
- [x] Declare copyright holder and open-source license: see below

Summary

This PR adds the diagnose method to CmdStanModel, which can be used to verify gradients, e.g., explicit gradients for custom functions implemented in C++. The documentation is based on the equivalent cmdstanr function and on other CmdStanPy methods, such as CmdStanModel.variational.
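For readers unfamiliar with this kind of gradient check, here is a minimal pure-Python sketch of the general idea: compare a hand-written gradient against a central finite-difference approximation and flag any coordinate whose difference exceeds a threshold. It is not the code in this PR and does not call CmdStan. The toy density, the function names, and the default values of `epsilon` and `error` are assumptions chosen for illustration.

```python
def log_density(theta):
    """Toy log density: independent standard normals, up to a constant."""
    return -0.5 * sum(t * t for t in theta)


def analytic_gradient(theta):
    """Hand-written gradient of log_density; this is what gets verified."""
    return [-t for t in theta]


def finite_diff_gradient(f, theta, epsilon=1e-6):
    """Central finite-difference approximation of the gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += epsilon
        down[i] -= epsilon
        grad.append((f(up) - f(down)) / (2 * epsilon))
    return grad


def check_gradients(f, grad_f, theta, epsilon=1e-6, error=1e-6):
    """Return one row per parameter plus an overall pass/fail flag."""
    model = grad_f(theta)
    approx = finite_diff_gradient(f, theta, epsilon)
    rows = [
        {"param_idx": i, "value": theta[i], "model": g,
         "finite_diff": fd, "error": g - fd}
        for i, (g, fd) in enumerate(zip(model, approx))
    ]
    ok = all(abs(row["error"]) <= error for row in rows)
    return rows, ok


rows, ok = check_gradients(log_density, analytic_gradient, [0.5, -1.2, 2.0])
```

A deliberately wrong gradient, such as one that is off by a factor of two, makes the same check fail, which is the situation the method is meant to surface.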

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Harvard University

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)

codecov-commenter commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 79.76%. Comparing base (dd95ba7) to head (8f3d3b4). Report is 1 commit behind head on develop.

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           develop     #734      +/-   ##
===========================================
- Coverage    80.23%   79.76%   -0.48%
===========================================
  Files           25       25
  Lines         3845     3879      +34
===========================================
+ Hits          3085     3094       +9
- Misses         760      785      +25
```

WardBrian commented 5 months ago

How would you feel about modeling this more after the log_prob method on CmdStanModel? Namely, not needing its own Args class or a holder for the return type, but just giving you back the table as a pandas dataframe and that's it.
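As a rough sketch of the "just give back the table" approach, the snippet below parses a gradient-test report into plain row dictionaries. The text layout is an assumption about what the tool prints, the numbers in `SAMPLE_OUTPUT` are made up for illustration, and `parse_gradient_table` and the column names are hypothetical rather than taken from this PR.

```python
import re

# Illustrative text only: the layout is assumed and the numbers are invented.
SAMPLE_OUTPUT = """\
TEST GRADIENT MODE

 Log probability=3.218

 param idx           value           model     finite diff           error
         0          1.6794         -1.8341         -1.8341    -1.42109e-10
         1         -0.2514          0.7312          0.7312     3.10862e-11
"""

# Hypothetical column names for the parsed table.
COLUMNS = ["param_idx", "value", "model", "finite_diff", "error"]

# A data row is an integer index followed by four numeric fields.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$")


def parse_gradient_table(text):
    """Extract the per-parameter rows, skipping headers and blank lines."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        idx, *numbers = match.groups()
        values = [int(idx)] + [float(x) for x in numbers]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


rows = parse_gradient_table(SAMPLE_OUTPUT)
```

Passing `rows` to `pandas.DataFrame(rows)` would then yield the table directly, with no dedicated Args class or result holder involved.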

tillahoffmann commented 5 months ago

Good point, updated. The only caveat is that if required_gradients_ok is False, the caller has no way to tell whether the gradients passed the checks. But this is likely to be used only interactively anyway, so it's probably not a problem.
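One possible workaround for the caveat above, sketched under assumptions: if the returned table has a per-parameter error column, the caller can recompute pass/fail by comparing absolute errors against the same threshold. The row layout, the `error` key, the example values, and both helper names are hypothetical and not part of this PR.

```python
def gradients_ok(rows, error=1e-6):
    """Return True if every row's absolute error is within the threshold."""
    return all(abs(row["error"]) <= error for row in rows)


def failing_params(rows, error=1e-6):
    """Return the indices of parameters whose error exceeds the threshold."""
    return [row["param_idx"] for row in rows if abs(row["error"]) > error]


# Hypothetical table contents: parameter 1 has a badly wrong gradient.
example_rows = [
    {"param_idx": 0, "error": -1.4e-10},
    {"param_idx": 1, "error": 2.5e-3},
]

passed = gradients_ok(example_rows)
failed = failing_params(example_rows)
```

The threshold passed to these helpers would need to match the one used for the original check for the result to agree with it.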

tillahoffmann commented 5 months ago

Thanks for the comments, I'll have a look. As an aside, how closely do we want the cmdstanpy interface to track cmdstanr? E.g., cmdstanr uses the object-oriented approach for diagnose, and it doesn't seem to implement log_prob (though I haven't dug too deep into the docs).

WardBrian commented 5 months ago

I believe they have support for log_prob through some optional Rcpp stuff, which is pretty far outside anything we’d want to do in Python.

We try to be consistent-ish for big things like the names and default values of the arguments to sample, but I don’t think it makes sense to aim for complete API matching, especially for smaller things like this.