paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0
1.27k stars 182 forks source link

Allow regex for coef in `set_prior` #724

Closed cfhammill closed 5 years ago

cfhammill commented 5 years ago

I have a model with an interaction term y ~ effect * group, where I have several hundred groups, I'd like to put a prior on all terms of the form effect:group, but presently I would have to build a prior object with one element per group.

A coef_regex argument could fix this, I'm happy to try to implement and submit a PR if this would be a useful feature for others.

paul-buerkner commented 5 years ago

set_prior is vectorized. If you can provide a vector with all coefficient names on which you want to set priors, lets call it coef_names, then you can use

set_prior(..., coef = coef_names)

Is this already sufficient for your needs?

cfhammill commented 5 years ago

yup, that works, although I will still need to extract coef_names with a regex.

paul-buerkner commented 5 years ago

Yes, but this could happen outside of brms and there is no need to add functionality for this purpose. The problem with such a feature would be that set_prior does not know about the model and so cannot solve the regex until much later on. I am hesitant to implement this due to the expected non-negliable code overhead. So my recommend workflow would be: (1) Find the correct coef_names with whatever strategy you like, (2) pass them to set_prior.

cfhammill commented 5 years ago

I did this in my code already using get_prior on the formula and data, it's a bit repetitive but not too bad. Do you have any other suggestions for a clean way to get the coefficient list?

And good point re carrying the regex around unmatched, I'm not sure how that would backfire but it might. I did have a look at how I would implement the feature and it didn't seem like it would add that much code, but I could be wrong.

paul-buerkner commented 5 years ago

get_prior is indeed the canonical way to extract coefficient names.

I will take a closer look at what would be a clean way to allow coef to be a regex, but in any case I would appreciate a proposal for a good user API for this feature.

cfhammill commented 5 years ago

In what form do you want the proposal? On the discourse, as a comment on this issue, or as a PR. Happy to do whichever, although in some ways I think using the non-linear formula trick from #621 is a better solution - force the user to construct a term containing just the coefs they want to be under a specific prior and use class and nlpar to set for the whole group.

paul-buerkner commented 5 years ago

In the form of hypothetical code of how users would pass the relevant information. For instance, we cannot simply use

set_prior(..., coef = "<regex>")

as brms would have no way of knowing that the information passed should be interpreted as a regex. Let me be clear, I am sceptical of this feature but I may be convinced otherwise by a good proposal for a user API.

I agree that the non-linear way of doing it may a cleaner solution anyway.

cfhammill commented 5 years ago

My suggestion would be to add an argument to set_prior

r set_prior(..., coef_regex = "<regex>")

such that

r set_prior(..., coef_regex = "<regex>", coef = "<coef>)"

produces an error indicating that coef and coef_regex cannot be used together.

Then dispatch the normal logic if coef_regex isn't passed, and have new logic for if it is. I suspect it's a few lines of code in set_prior, .set_prior, brmsprior, with the bulk of the logic in check_prior

paul-buerkner commented 5 years ago

Yes, I agree, thanks. I will think of whether I do want to make this change. After all, each feature is something I need to maintain in the end and I think the added benefit of this may be minor anyway. I will come back to you at some point next week with a decision.

cfhammill commented 5 years ago

Sounds good to me, thanks! I think we're both in Helsinki next week anyway, maybe we can chat over coffee.

paul-buerkner commented 5 years ago

great! Just write me an email to schedule a time for coffee :-)

paul-buerkner commented 5 years ago

After giving it some more thought I think that this convenience feature is not worth the structural changes required to brmsprior objects and the required code overhead to solve the regex at the right places. Also, it is not entirely clear to me how updating existing prior objects should work when using regex coefficients. Closing this issue.