Closed barrettk closed 5 months ago
Comments from meeting with @curtisKJ, @timwaterhouse, and @seth127:
THETA bounds: don't mess with bounds at all. If the new value falls outside of one of the bounds, we are leaning towards setting the initial estimate to the value of the bound.
Matrix Records (e.g., OMEGA records) may need additional handling. It was mentioned we should ensure positive definite values (positive determinant). We should investigate if PsN does anything like this.
FIXED values should be left alone. Mention this in the documentation, but don't return messages or warnings when this happens.
... THETA records?
At the end Curtis brought up the consideration of log-scaled THETA records. In these cases, the percent may actually have a larger impact than intended/assumed. Determining if the value is in the log scale would be difficult, and would require both A) parsing the PK records block to look for exp() occurrences, and B) mapping it back to THETA records (cc @kyleam). This would likely be a lot of work. Curtis suggested this may be overcomplicating things, but I wanted to mention this discussion just in case it was worth exploring.
Thanks for the write-up @barrettk, and thanks very much for your input @timwaterhouse and @curtisKJ. This all sounds good to me, and it answers most of the questions we had above. A few comments from me:
I think the consensus is, ideally, to allow the user to control which blocks to update, similar to the inherit_param_estimates() interface.
That said, if jittering the matrix records gets complicated (see next comment), we may roll out an initial version of this that only jitters the THETA estimates.
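To make the "user controls which blocks to update" idea concrete, here is a minimal sketch. Everything in it is hypothetical: `jitter_inits()`, its `.records` argument, and the plain named list of initial estimates are stand-ins for illustration, not an existing interface, and the element-wise jitter it applies to the matrix records ignores the positive-definiteness concern.

```r
# Hypothetical interface (not an existing function): `inits` is a named list of
# numeric vectors, one element per record type.
jitter_inits <- function(inits, .p = 0.1,
                         .records = c("theta", "omega", "sigma")) {
  # Accept any subset of the allowed record types
  .records <- match.arg(.records, several.ok = TRUE)
  for (rec in intersect(.records, names(inits))) {
    vals <- inits[[rec]]
    # Scale each value by its own factor drawn from [1 - .p, 1 + .p]
    inits[[rec]] <- vals * (1 + runif(length(vals), -.p, .p))
  }
  inits
}

set.seed(42)
inits <- list(theta = c(2, 50), omega = c(0.1, 0.2), sigma = 0.04)

# Only jitter THETA; OMEGA and SIGMA come back untouched
only_theta <- jitter_inits(inits, .p = 0.1, .records = "theta")
identical(only_theta$omega, inits$omega)  # TRUE: omega was not selected
```

Defaulting `.records` to all three types and letting the caller narrow it would also make a THETA-only first release a matter of restricting the allowed choices.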
Matrix Records (e.g., OMEGA records) may need additional handling. It was mentioned we should ensure positive definite values (positive determinant). We should investigate if PsN does anything like this.
That's interesting. We will definitely want to look into this and keep getting scientist input along the way.
In terms of how PsN does it, some research is definitely warranted, but step 1 is confirming whether it even touches these matrices.
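As a starting point for that investigation, here is a rough sketch of what a positive-definiteness check could look like on our side. The helper names (`is_pos_def()`, `jitter_matrix()`) and the resample-until-valid strategy are assumptions for illustration only; this is not a description of what PsN does. Note that a positive determinant on its own is not sufficient (a 2x2 matrix with two negative eigenvalues has a positive determinant), so the sketch checks the eigenvalues instead.

```r
# Hypothetical helper: TRUE if `m` is symmetric with all eigenvalues > tol
is_pos_def <- function(m, tol = 1e-8) {
  if (!isSymmetric(m)) return(FALSE)
  eig <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
  all(eig > tol)
}

# Hypothetical helper: jitter a covariance block, resampling until the
# result is positive definite (or giving up after `max_tries` attempts)
jitter_matrix <- function(m, .p = 0.1, max_tries = 100) {
  idx <- lower.tri(m, diag = TRUE)
  for (i in seq_len(max_tries)) {
    new <- m
    # Jitter the lower triangle (including the diagonal) element by element
    new[idx] <- m[idx] * (1 + runif(sum(idx), -.p, .p))
    # Mirror the lower triangle into the upper triangle to keep symmetry
    new[upper.tri(new)] <- t(new)[upper.tri(new)]
    if (is_pos_def(new)) return(new)
  }
  stop("no positive definite matrix found after ", max_tries, " attempts")
}

set.seed(1)
omega <- matrix(c(0.1, 0.05, 0.05, 0.2), nrow = 2)
jittered <- jitter_matrix(omega, .p = 0.1)
is_pos_def(jittered)  # TRUE: jitter_matrix() only returns matrices that pass
```

Resampling is only one option; erroring out immediately or leaving the matrix untouched would be equally defensible, and that choice probably needs scientist input.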
Do we want to tweak values using a single percent, or a different sampled percent for each occurrence? We are leaning towards the latter.
I'm not 100% sure I follow this comment, but I think the runif(length(thetas), -.p, .p) part of the example code at the top (which is the current intent) is describing the latter approach.
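For reference, a minimal sketch of the per-value approach, combined with the meeting's leanings on bounds and FIXED values. The function name, its arguments, and the example data frame are all hypothetical; the real implementation would get these values by parsing the control stream.

```r
# Hypothetical helper: jitter each initial estimate by its own sampled percent
jitter_thetas <- function(init, lower, upper, fixed, .p = 0.1) {
  stopifnot(
    length(init) == length(lower),
    length(init) == length(upper),
    length(init) == length(fixed)
  )
  # One draw per value (a single shared percent would be runif(1, -.p, .p))
  pct <- runif(length(init), -.p, .p)
  new <- init * (1 + pct)
  # Bounds are never modified; a value outside a bound is set to that bound
  new <- pmin(pmax(new, lower), upper)
  # FIXED values are left alone, with no message or warning
  new[fixed] <- init[fixed]
  new
}

set.seed(123)
thetas <- data.frame(
  init  = c(2, 50, 0.5, 1),
  lower = c(0, 0, 0.49, -Inf),
  upper = c(Inf, 52, 0.51, Inf),
  fixed = c(FALSE, FALSE, FALSE, TRUE)
)
thetas$new <- jitter_thetas(
  thetas$init, thetas$lower, thetas$upper, thetas$fixed, .p = 0.1
)
thetas
```

With a 10% jitter, the second and third rows can hit their bounds, and the fourth row always keeps its initial value because it is flagged as fixed.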
At the end Curtis brought up the consideration of log-scaled THETA records. In these cases, the percent may actually have a larger impact than intended/assumed.
That's an interesting point, but I think we're probably not going to do anything with that (in terms of checking whether certain values are log-scale), at least on our first release. Maybe something to consider for future improvements, especially if folks notice this becoming an issue.
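For anyone picking this up later, a quick illustration of why the effect is larger on the log scale. The numbers are made up: if a THETA holds log(50) and gets the maximum +10% jitter, the natural-scale parameter is multiplied by exp(theta * p) rather than by 1.10.

```r
# Illustrative values only
p <- 0.10
theta_log <- log(50)                 # a parameter estimated on the log scale
jittered_log <- theta_log * (1 + p)  # apply the maximum +10% jitter

# Relative change of the back-transformed (natural-scale) parameter
natural_change <- exp(jittered_log) / exp(theta_log) - 1
natural_change
# Equals exp(theta_log * p) - 1, roughly 0.48 here rather than 0.10
```

The amplification grows with the magnitude of the log-scale value, so the same percent can mean very different things across parameters.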
this kind of function could be useful for bootstrap calls
I'm interested in the bootstrap discussion as well, though I don't immediately follow the connection. But, as Kyle mentions, that would be a discussion for another day/issue.
Background
Some Background Discussion
Sample code for how we propose updating the estimates is shown below, though note that we will be parsing the control stream file rather than pulling parameter estimates from a model summary of a previously executed model.
Details on how PsN/Pirana uses this feature:
Feedback requested
The overall procedure for doing this is somewhat simple, though there remain a few outstanding questions that require scientist feedback:
... THETA values?
... FIXED parameters?
... inherit_param_estimates, do we want the option to only tweak THETA records for instance (i.e. ignore OMEGA and SIGMA records)?