This issue documents our discussions on whether and how we should truncate the prior of a non-regression parameter. The default behavior is that we truncate unless the parameter specification comes entirely from the defaults, for which we have already considered the boundaries and chosen bounded distributions (there might be exceptions to this too). For example, if a user specifies a normal distribution for parameter v with bounds (-5, 5), we will truncate this normal distribution. However, if the user does not specify anything, we do not truncate.
By truncating the distributions, we meant to provide an additional safeguard to ensure that samples do not go out of bounds. We already check, after the likelihood calculation, whether a parameter is within the specified bounds, and if not, we discard the sample by setting the likelihood to a very small value.
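To make the difference between the two safeguards concrete, here is a minimal standard-library sketch. It is not HSSM code: the function names and the `LOG_FLOOR` constant are hypothetical, and the floor is applied on the log scale as an assumption about what "a very small value" means in practice.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log-density of an untruncated Normal(mu, sigma)."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(x, mu, sigma):
    """CDF of Normal(mu, sigma), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def truncated_normal_logpdf(x, mu, sigma, lower, upper):
    """Safeguard 1: truncate the prior, so out-of-bounds values have zero density."""
    if x < lower or x > upper:
        return -math.inf
    # Renormalize by the probability mass that remains inside the bounds.
    mass = normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)
    return normal_logpdf(x, mu, sigma) - math.log(mass)


# Hypothetical floor standing in for "a very small value" (assumption).
LOG_FLOOR = math.log(1e-30)


def bounded_loglik(loglik, value, lower, upper):
    """Safeguard 2: after the likelihood is computed, floor it if out of bounds."""
    return loglik if lower <= value <= upper else LOG_FLOOR
```

With bounds (-5, 5), `truncated_normal_logpdf(6.0, 0.0, 2.0, -5.0, 5.0)` is `-inf`, while `bounded_loglik(-1.2, 6.0, -5.0, 5.0)` returns the floor. In this sketch, either check alone is enough to reject the out-of-bounds value.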
The issue with this practice, however, is the mechanism through which the truncation is done. The only way to specify a truncated distribution through Bambi is to wrap a pm.Truncated() distribution within a function and pass this function as the dist parameter of bmb.Prior. We also created a hssm.Prior class, a subclass of bmb.Prior, that does this when bounds is provided to the class. This mechanism solves the issue of representing a truncated prior in Bambi, but creates the following problems:
Consistency in scaling. bmb.Prior has an auto_scale parameter, True by default, through which Bambi can apply scaling to some of the parameters. Because truncated priors are wrapped in functions, the same auto-scaling might not happen when we truncate the priors. As a consequence, if a user passes the same parameter specification to bmb.Prior and hssm.Prior without specifying auto_scale, different scaling strategies might be applied, and the user won't realize that two specifications that look the same might be slightly different.
Opacity in parameter specification. It's difficult to look into the function wrapper passed to the dist parameter, so we might not know what is actually specified in the models we build. The model printout relies entirely on the information provided to the hssm.Prior class, which may or may not represent the parameter specification after truncation. If anything is wrong there, it's not possible to know.
The rules for when to truncate and when not to are not explicit, and they are never made clear to the users.
Advanced users might already consider the possible ranges of some parameters and provide specifications that are already reasonable, which makes the added safeguard unnecessary.
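The opacity and scaling concerns above can be illustrated with a small stand-in. This is a sketch of the general closure pattern only: `SketchPrior` and `make_truncated` are hypothetical names, not Bambi's or HSSM's classes, and the builder returns a plain tuple where real code would create the truncated distribution.

```python
class SketchPrior:
    """Hypothetical stand-in for a prior object; this is not Bambi's class."""

    def __init__(self, name, dist=None, **kwargs):
        self.name = name
        self.dist = dist      # optional callable that builds the distribution
        self.kwargs = kwargs  # the only parameters visible from outside

    def __repr__(self):
        args = ", ".join(f"{k}: {v}" for k, v in self.kwargs.items())
        return f"{self.name}({args})"


def make_truncated(mu, sigma, lower, upper):
    """Return a builder whose parameters live only inside the closure."""

    def builder(name):
        # A real builder would create the truncated distribution here.
        return ("Truncated", name, mu, sigma, lower, upper)

    return builder


plain = SketchPrior("Normal", mu=0.0, sigma=2.0)
wrapped = SketchPrior("Normal", dist=make_truncated(0.0, 2.0, -5.0, 5.0))

print(repr(plain))    # parameters are visible
print(repr(wrapped))  # parameters are hidden in the closure
```

Any logic that inspects `kwargs`, such as a printout or a scaling step, sees `mu` and `sigma` for `plain` but nothing for `wrapped`, even though the builder still uses them. That is the kind of mismatch the points above describe.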
I think this issue boils down to whether the additional safeguard justifies the added layers of opacity for users and the added complexity in implementation and maintenance. If we want to proceed with this, we need clearly documented, explicit rules about when truncation is applied and how we communicate to users that their specifications have been modified.
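As one possible shape for such rules, the sketch below encodes the current default (truncate only user-supplied specifications that have bounds) in a single function and warns whenever a specification is modified. All names are hypothetical, and representing a prior as a plain dict is an assumption made for illustration.

```python
import warnings


def should_truncate(user_specified, bounds):
    """Explicit rule: truncate only user-supplied priors that have bounds."""
    return bool(user_specified) and bounds is not None


def resolve_prior(param, spec, bounds, user_specified):
    """Return the prior spec to use, warning the user if it was truncated."""
    if not should_truncate(user_specified, bounds):
        return dict(spec)  # defaults are left untouched
    lower, upper = bounds
    warnings.warn(
        f"Prior for '{param}' was truncated to ({lower}, {upper}).",
        UserWarning,
        stacklevel=2,
    )
    return {**spec, "lower": lower, "upper": upper}
```

Keeping the rule in one place makes it easy to document, and the warning tells the user that the specification they passed is not exactly what the model uses.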