Open andrjohns opened 1 year ago
I’m not sure why these methods exist in the generated model to be honest, since nothing currently seems to use them.
Do you have a use in mind for them?
I was after a way of getting the dimensions of the unconstrained parameters, and get_unconstrained_sizedtypes()
looks like the only option
Another option could be to parse the output of unconstrained_param_names()
, but also open to other suggestions if there's something obvious I'm missing!
I was after a way of getting the dimensions of the unconstrained parameters
What's the eventual goal?
I missed when those functions got added and would not have approved. I don't see the point of get_constrained_sizetypes()
(or the prefix "get_"), as it just returns a string encoding name/size pairs. It'd be better to return a pair consisting of a std::vector<std::string>
of strings and a std::vector<size_t>
of sizes.
They were added back in 2019 but never exposed or used anywhere: https://github.com/stan-dev/stanc3/pull/226
Using std::vector<size_t>
for objects is more or less singlehandedly holding up the tuples PR at this point, so avoiding using that is a win in my eyes.
The JSON these return is much richer anyway since it lets you distinguish matrix[3,2]
and array[3,2] real
, for example, and is probably the part of the compiler which was most ready for tuples/non-rectangular types, and would probably be useful for cmdstanpy/r when it comes time to process CSV outputs which contain tuples
What's the eventual goal?
I'm updating the handling of unconstrained pars in cmdstanr
to be structured using the draws
formats from posterior
(motivation in this issue).
For that I need to know their dimensions - given that they can differ from the constrained parameter dimensions returned by get_dims()
I wrote back on the original issue asking why. I don't understand the motivation and have the same concerns as @mitzimorris expressed about the proposed solution.
I've always found the R devs to have almost diametrically opposed preferences to mine, which is for minimal but composable (i.e., unix-like) APIs. For example, I prefer the simple direct manipulations of NumPy to the do-it-all functions of the Tidyverse.
I wrote back on the original issue asking why. I don't understand the motivation and have the same concerns as @mitzimorris expressed about the proposed solution.
Note that the proposed solution from the linked post (a named list) has since been eschewed in favour of the draws
formats - which is what the results of sampling/optimisation/etc (i.e., the constrained parameters) currently use. In other words, it's aligning the APIs for the constrained and unconstrained parametetrs.
At a higher level, the motivation is that there are multiple use-cases which need to operate on the unconstrained parameters (e.g., moment-matching, power-scaling, plotting). Users/packages currently need to manually track the indexes of particular parameters in the single unlabeled numeric vector of values when they're performing operations or constructing their own unconstrained parameter inputs - not very user-friendly or robust to error.
Additionally, the current single numeric vector structure isn't terribly amenable to a workflow operating on multiple iterations across multiple chains.
Given that we have an existing API in the draws
format for operating on parameters from multiple chains, as well as access to the model functions for unconstrained parameter dimensions, it seems like a good quality-of-life improvement for users
And also to clarify that the proposed solution in the linked post was using the existing structure for specifying initial values that is used by both rstan
and cmdstanr
Summary:
stanc
outputs methods for returning a json-formatted string of the parameter metadata, like the following for the bernoulli model:But these methods don't have corresponding virtual methods in
src/model/model_base.hpp
Current Version:
v2.31.0