metrumresearchgroup / bbr

R interface for model and project management
https://metrumresearchgroup.github.io/bbr/

New Feature: jitter/tweak initial parameter estimates #632

Closed barrettk closed 5 months ago

barrettk commented 6 months ago

Background

Some Background Discussion

The sample code below shows how we propose updating the estimates. Note that the actual implementation will parse the control stream file rather than pull parameter estimates from the model summary of a previously executed model.

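library(bbr)

# THETA estimates from a previously executed model (mod106 is an existing bbr model object)
thetas <- get_theta(model_summary(mod106))

# user input: maximum fraction to tweak by (0.1 = 10%)
.p <- 0.1

# fraction actually applied to each THETA, sampled uniformly from [-.p, .p]
tweak_perc <- runif(length(thetas), -.p, .p)

# apply the tweak (this is how PsN works)
thetas <- thetas + (thetas * tweak_perc)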

Details on how PsN/Pirana uses this feature:

 -degree = fraction
            Default not set. After updating the initial estimates in the output file,
            randomly perturb them by degree=fraction, i.e. change estimate to
            a value randomly chosen in the range estimate +/- estimate*fraction
            while respecting upper and lower boundaries, if set in the model file.
            Degree is set to 0.1, a 10% perturbation, when option tweak_inits is
            set in execute.
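
The "respecting upper and lower boundaries" part of the PsN description could be sketched roughly as below. This is a minimal sketch that assumes the sampling range is simply truncated to the bounds; how PsN actually handles this has not been verified here. `jitter_bounded()` and its arguments are hypothetical names for illustration, not part of bbr or PsN.

```r
# Hypothetical helper: jitter each value within +/- (value * fraction),
# truncating the sampling range to [lower, upper] where bounds are set.
jitter_bounded <- function(est, fraction = 0.1, lower = -Inf, upper = Inf) {
  stopifnot(fraction >= 0, all(lower <= est), all(est <= upper))
  # half-width of the sampling range; abs() keeps it positive for negative estimates
  half_width <- abs(est) * fraction
  lo <- pmax(est - half_width, lower)
  hi <- pmin(est + half_width, upper)
  # runif() is vectorized over min/max, so each estimate gets its own draw
  runif(length(est), min = lo, max = hi)
}

set.seed(1)
est   <- c(2.5, 40, 0.95, -1.2)  # illustrative initial estimates
lower <- c(0, 0, 0, -Inf)
upper <- c(Inf, Inf, 1, Inf)
new_est <- jitter_bounded(est, fraction = 0.1, lower = lower, upper = upper)
```

One consequence of truncating the range: an estimate that sits close to a bound (0.95 with an upper bound of 1 here) is no longer sampled symmetrically around its original value.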

Feedback requested

The overall procedure is fairly simple, but a few outstanding questions require scientist feedback:

barrettk commented 6 months ago

Comments from meeting with @curtisKJ, @timwaterhouse, and @seth127:

seth127 commented 6 months ago

Thanks for the write-up @barrettk, and thanks very much for your input @timwaterhouse and @curtisKJ. This all sounds good to me, and it answers most of the questions we had above. A few comments from me:


I think the consensus is, ideally, to allow the user to control which blocks to update, similar to the inherit_param_estimates() interface.

That said, if jittering the matrix records gets complicated (see next comment), we may roll out an initial version of this that only jitters the THETA estimates.


Matrix Records (e.g., OMEGA records) may need additional handling. It was mentioned we should ensure positive definite values (positive determinant). We should investigate if PsN does anything like this.

That's interesting. We will definitely want to look into this and keep getting scientist input along the way.

In terms of how PsN does it, some research is definitely warranted, but step 1 is confirming whether it even touches these matrices.
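
For reference while we look into this: a positive determinant on its own does not guarantee positive definiteness, so a check on the eigenvalues is safer. Below is a minimal sketch of one possible approach (jitter the lower triangle, mirror it, retry until the result is positive definite). `is_pos_def()` and `jitter_block()` are hypothetical helpers, and nothing here is a claim about what PsN does.

```r
# Hypothetical helper: TRUE if a symmetric matrix is positive definite,
# i.e. all eigenvalues are strictly positive.
is_pos_def <- function(mat, tol = 1e-8) {
  eig <- eigen(mat, symmetric = TRUE, only.values = TRUE)$values
  all(eig > tol)
}

# Hypothetical helper: jitter the lower triangle (including the diagonal) of a
# symmetric block, mirror it, and retry until the result is positive definite.
jitter_block <- function(mat, fraction = 0.1, max_tries = 100) {
  idx <- lower.tri(mat, diag = TRUE)
  for (i in seq_len(max_tries)) {
    new_mat <- mat
    vals <- mat[idx]
    new_mat[idx] <- vals + vals * runif(length(vals), -fraction, fraction)
    # copy the jittered lower triangle onto the upper triangle
    new_mat[upper.tri(new_mat)] <- t(new_mat)[upper.tri(new_mat)]
    if (is_pos_def(new_mat)) return(new_mat)
  }
  stop("No positive-definite matrix found after ", max_tries, " tries")
}

set.seed(1)
omega <- matrix(c(0.10, 0.03,
                  0.03, 0.20), nrow = 2, byrow = TRUE)  # illustrative OMEGA block
new_omega <- jitter_block(omega)

# A positive determinant is not sufficient on its own:
neg <- diag(c(-1, -1))
det(neg)         # 1
is_pos_def(neg)  # FALSE
```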


Do we want tweak values using a single percent, or a different sampled percent for each occurrence? We are leaning towards the latter.

I'm not 100% sure I follow this comment, but I think the runif(length(thetas), -.p, .p) part of the example code at the top (which is the current intent) is describing the latter approach.
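
To make the two options concrete, here is a small sketch with made-up THETA values (the numbers are illustrative only, not taken from any model):

```r
set.seed(42)
thetas <- c(2.5, 40, 0.95)  # illustrative values
.p <- 0.1

# Option 1: a single sampled fraction shared by every parameter.
# All estimates move in the same direction by the same relative amount.
single_perc <- runif(1, -.p, .p)
thetas_single <- thetas + thetas * single_perc

# Option 2: a separately sampled fraction for each parameter
# (what runif(length(thetas), -.p, .p) does).
each_perc <- runif(length(thetas), -.p, .p)
thetas_each <- thetas + thetas * each_perc

# Relative change per parameter under each option
rel_single <- thetas_single / thetas - 1  # same for all parameters (up to floating-point error)
rel_each   <- thetas_each / thetas - 1    # differs across parameters
```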


At the end Curtis brought up the consideration of log-scaled THETA records. In these cases, the percent may actually have a larger impact than intended/assumed.

That's an interesting point, but I think we're probably not going to do anything with that (in terms of checking whether certain values are log-scale), at least on our first release. Maybe something to consider for future improvements, especially if folks notice this becoming an issue.
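
For anyone who wants to see the size of the effect: if a THETA is estimated on the log scale, the natural-scale value is exp(theta), so tweaking theta by a fraction f changes the natural-scale value by exp(theta * f) - 1 rather than by f. A quick sketch with illustrative values (`natural_scale_change()` is a hypothetical helper):

```r
# Relative change on the natural scale when a log-scale THETA is tweaked
# by a fraction f: exp(theta * (1 + f)) / exp(theta) - 1 = exp(theta * f) - 1
natural_scale_change <- function(theta_log, f) {
  exp(theta_log * f) - 1
}

theta_log <- c(0.5, 2, 5)  # illustrative log-scale estimates
f <- 0.1                   # a 10% tweak on the log scale

round(natural_scale_change(theta_log, f), 3)
# [1] 0.051 0.221 0.649
```

So the same 10% tweak corresponds to roughly a 5%, 22%, and 65% change on the natural scale, depending on the magnitude of the log-scale estimate.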


This kind of function could be useful for bootstrap calls.

I'm interested in the bootstrap discussion as well, though I don't immediately follow the connection. But, as Kyle mentions, that would be a discussion for another day/issue.