paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0

zero-inflated asymmetric laplace distribution #723

Closed: pauljeco closed this issue 10 months ago

pauljeco commented 5 years ago

Many data sets, e.g. rainfall data, contain numerous zeros. Adding this functionality would open a whole new dimension for analysing climate and other data. I mention the zero-inflated asymmetric Laplace distribution specifically because of its potential for quantile regression. This would provide a mechanism for making sense of the inherent variance in time-series climate data. I believe this would be useful to many climate scientists.

HaydenTWilson commented 5 years ago

That's a great idea. Please do. This is exactly what we need.

paul-buerkner commented 5 years ago

Has someone written about this distribution yet? Do we understand its properties?

pauljeco commented 5 years ago

Hi Paul, thanks for your quick response. I'm not aware of any documentation of the 'zero-inflated asymmetric Laplace distribution'. An alternative route to quantile regression for climate data with many zeros might be to implement quantile functionality for your hurdle_gamma distribution? Although I'm not sure of the technicalities of implementing this, it would certainly be useful.

paul-buerkner commented 5 years ago

I am wondering whether it could make sense to work together on a more methodological paper on the zero-inflated asymmetric Laplace distribution (given that no one has worked on it before, as I understand from your response) before applying it in practice. The reason is that the main parameter of the asymmetric Laplace distribution may no longer describe a given quantile once we introduce zero-inflation, but this is something that needs to be worked out. Perhaps you (or someone else) would be interested in leading such an endeavor. I would gladly help with this but don't have the time to take the lead.

pauljeco commented 5 years ago

Thanks Paul. If you were able to write the function for Stan in brms, I would be glad to run some tests with it using a long-term rainfall data set. A methodological paper would be a good outcome.

paul-buerkner commented 5 years ago

That could be a start. I will post an update once the brms family is ready.

pauljeco commented 5 years ago

Thanks Paul - looking forward to this - much appreciated

paul-buerkner commented 5 years ago

What I do wonder is why the asymmetric Laplace distribution is used for rainfall data at all.

To my understanding, rainfall can only be non-negative, while the asymmetric Laplace distribution supports both positive and negative values. Having zero-inflation "in the middle" of a distribution is something I have yet to see in real data.

pauljeco commented 5 years ago

Thanks Paul. Possibly because of this, a Tweedie distribution has been used more often. For the zero-inflation, a compound Poisson-gamma model (doi.org/10.1155/2018/1012647) appears to be a good approach, although from my limited experience the math behind it is beyond me. This would be great to implement. However, my interest was in quantile regression: changes in central tendency or the median are non-significant, so it is more interesting to see what is happening at different quantiles. I had a really rudimentary analysis like this published (frequency of events, i.e. count data: doi.org/10.1016/j.ppees.2012.09.005) but would like a better approach, if possible. Otherwise, I can fit multiple zero-inflated Poisson models for classes of event frequency.

Have you had any other interest in implementing quantile regression for the (hurdle) gamma distribution? I'm not sure whether this is possible, though.

paul-buerkner commented 5 years ago

Quantile regression cannot just be enabled for an arbitrary family. Instead, the key feature of the asymmetric Laplace distribution with parameters mu, sigma, and p (the quantile parameter) is that P(X < mu) = p, which is what makes the distribution usable for quantile regression in the first place. For a zero-inflated asymmetric Laplace distribution to be useful, we have to preserve this key property; otherwise we won't be able to use it for quantile regression anymore.
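
A short worked equation may make the concern concrete. It assumes the usual mixture construction for zero-inflation, in which the outcome is exactly zero with some probability and otherwise follows the asymmetric Laplace distribution. The symbol `zi` for the zero-inflation probability, the notation AL(mu, sigma, p), and the mixture form itself are assumptions for illustration, not something settled in this thread.

```latex
% Assumed mixture form: zero with probability zi, otherwise an asymmetric Laplace draw X
Y =
\begin{cases}
  0 & \text{with probability } zi, \\
  X & \text{with probability } 1 - zi,
\end{cases}
\qquad X \sim \mathrm{AL}(\mu, \sigma, p), \quad P(X < \mu) = p.

% X is continuous, so the only point mass of Y is at zero
P(Y < \mu) = zi \cdot \mathbf{1}[\mu > 0] + (1 - zi)\, p
=
\begin{cases}
  p + zi\,(1 - p) & \text{if } \mu > 0, \\
  (1 - zi)\, p    & \text{if } \mu \le 0.
\end{cases}
```

Under this assumed form, mu equals the p-quantile of Y only when zi = 0. For zi > 0 it still describes the p-quantile of the non-zero component, which is not the same thing as the p-quantile of the observed outcome.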

Let's have a Skype call about this issue to make sure we are on the same page and to better understand whether the zero-inflated asymmetric Laplace distribution is a promising approach worth spending more time on. Do you mind writing me an email about this? I can't see your email address right now.

pauljeco commented 5 years ago

Have dropped you an email using this ticket header

paul-buerkner commented 5 years ago

Thank you! What time would work for you? For me, the start of next week would be ideal, but I may also manage other times. Which time zone are you in?

pauljeco notifications@github.com wrote on Thu, 8 Aug 2019, 11:04:

Have dropped you an email using this ticket header


paul-buerkner commented 5 years ago

This was meant to be sent via email. Sorry :-D


paul-buerkner commented 5 years ago

An experimental version of the zero_inflated_asym_laplace distribution is now available in the GitHub version of brms. Please note that this family is not yet documented or officially supported, as we still need to get a better understanding of the properties of this distribution (and whether it's generally useful).

Here is an example of how to use it:

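library(brms)

# simulate 1000 asymmetric Laplace draws with location mu = 5
y <- rasym_laplace(1000, mu = 5)
# append 100 exact zeros to create zero-inflation
y <- c(y, rep(0, 100))
dat <- data.frame(y, x = rnorm(1100))

# fit the experimental zero-inflated asymmetric Laplace family
fit <- brm(y ~ x, data = dat,
           family = brmsfamily("zero_inflated_asym_laplace"))

# posterior predictive check, effect plots, and approximate LOO-CV
pp_check(fit)
marginal_effects(fit)  # called conditional_effects() in later brms versions
loo(fit)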
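
As a rough sanity check on the open question of whether mu still marks the p-quantile after zeros are added, the following base-R sketch compares the share of values below mu before and after appending zeros. The helper `r_ald` is hypothetical (it is not part of brms and only stands in for `rasym_laplace` via inverse-CDF sampling), and the 10:1 ratio of draws to zeros simply mirrors the example above.

```r
# Hypothetical helper (not part of brms): draw from an asymmetric Laplace
# distribution with location mu, scale sigma and quantile p by inverting its CDF.
r_ald <- function(n, mu = 0, sigma = 1, p = 0.5) {
  u <- runif(n)
  ifelse(u <= p,
         mu + sigma / (1 - p) * log(u / p),
         mu - sigma / p * log((1 - u) / (1 - p)))
}

set.seed(723)
p  <- 0.5
mu <- 5
x  <- r_ald(10000, mu = mu, p = p)  # plain asymmetric Laplace draws
y  <- c(x, rep(0, 1000))            # append exact zeros (10:1 ratio)
zi <- 1000 / length(y)              # realised share of appended zeros

prop_plain    <- mean(x < mu)       # should be close to p
prop_inflated <- mean(y < mu)       # shifted upwards by the zeros
expected      <- zi + (1 - zi) * p  # mixture value, valid here because mu > 0

print(c(plain = prop_plain, inflated = prop_inflated, expected = expected))
```

In this setup the share below mu moves from roughly p to roughly zi + (1 - zi) * p, which illustrates why mu of the non-zero component should not be read as the p-quantile of the zero-inflated outcome without further work.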

michaelkyei66 commented 4 years ago

Hello everyone,

I am trying to model a continuous (non-integer) outcome variable with excess zeros. I have four independent variables to predict the outcome variable. I have seen that people have referred to the compound Poisson-gamma model (https://doi.org/10.1111/2041-210X.12122) as one of the possible models for such data. My questions are:

  1. Which model would be most appropriate for my data?
  2. Do you have a similar R script that I could follow as an example?

Best wishes,

Michael Kyei

paul-buerkner commented 4 years ago

Please ask brms-related questions on https://discourse.mc-stan.org/