Closed poypoyan closed 2 years ago
There seem to be two ways:

1) Parametric estimators (e.g. Maximum Likelihood Estimation (MLE))
Sample: https://www.statsmodels.org/devel/dev/generated/statsmodels.base.model.GenericLikelihoodModel.html
Pros: If the correct probability distribution is chosen, the MLE is asymptotically efficient, i.e. its variance approaches the Cramér–Rao lower bound as the sample size grows.
Cons: You have to guess the "correct" probability distribution. If the wrong distribution is chosen, the estimate can be badly biased, and more data will not fix that.
2) Non-parametric estimators (e.g. Kernel Density Estimation (KDE))
Sample: https://github.com/tommyod/KDEpy
Pros: You don't have to guess the "correct" probability distribution.
Cons: Less interpretable than MLE, since there are no parameters with a direct meaning. May need a larger sample than MLE to reach the same accuracy.
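To make the contrast between the two approaches concrete, here is a minimal sketch using only NumPy. It assumes durations are positive integers and uses the geometric distribution purely as an illustrative parametric family; the function names `empirical_pmf` and `geometric_mle_pmf` are hypothetical and not part of any library mentioned above. For discrete durations the simplest non-parametric estimate is the relative-frequency histogram, which is used here in place of a KDE.

```python
import numpy as np

def empirical_pmf(durations, n_durations):
    """Non-parametric estimate: relative frequency of each duration 1..n_durations."""
    durations = np.asarray(durations)
    # bincount index 0 would be duration 0, which cannot occur, so drop it.
    counts = np.bincount(durations, minlength=n_durations + 1)[1:n_durations + 1]
    return counts / counts.sum()

def geometric_mle_pmf(durations, n_durations):
    """Parametric estimate: geometric on {1, 2, ...}, whose MLE is p = 1 / mean."""
    p = 1.0 / np.mean(durations)
    d = np.arange(1, n_durations + 1)
    return p * (1.0 - p) ** (d - 1), p

# Synthetic durations drawn from a geometric distribution (assumed data).
rng = np.random.default_rng(0)
samples = rng.geometric(p=0.3, size=50)
n_durations = int(samples.max())

nonparam = empirical_pmf(samples, n_durations)
param, p_hat = geometric_mle_pmf(samples, n_durations)
```

With only 50 samples the histogram is noisy, while the parametric curve is smooth because it is controlled by a single number. Note that the parametric PMF truncated at `n_durations` sums to slightly less than 1, since the geometric tail extends past the longest observed duration.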
I need to study this more. Closing this for now.
Currently, the duration probabilities per state are stored in a 2D array of shape (n_states, n_durations). There are situations where, in addition to these "non-parametric" duration PMFs per state, the durations need to be modeled by parametric distributions.
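For reference, a small sketch of the array layout described above. The indexing convention (column `d` holds the probability of duration `d + 1`) and the variable name `dur_probs` are assumptions made for illustration; the numbers are made up.

```python
import numpy as np

n_states, n_durations = 2, 4

# dur_probs[i, d] = P(duration = d + 1 | state i); each row is one PMF.
dur_probs = np.array([
    [0.10, 0.20, 0.30, 0.40],
    [0.50, 0.25, 0.15, 0.10],
])

assert dur_probs.shape == (n_states, n_durations)
assert np.allclose(dur_probs.sum(axis=1), 1.0)  # every row must sum to 1

# Expected duration per state, a quantity most parametric fits start from.
durations = np.arange(1, n_durations + 1)
mean_dur = dur_probs @ durations  # -> [3.0, 1.85]
```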
The 'hsmm' R package offers 4 parametric distributions for duration:
I need help with the math behind estimating the parameters of a parametric distribution from a non-parametric duration PMF, especially for these 4 distributions. Please suggest some resources. I prefer clear algorithms, but anything relevant is welcome.
Thanks!
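One possible way to go from a non-parametric PMF row to parametric parameters is weighted maximum likelihood: treat the PMF values as weights and maximize sum_d pmf[d] * log q(d; theta), which is the same as minimizing the KL divergence from the PMF to the parametric family. The sketch below assumes durations start at 1 and uses two families as examples, a geometric and a Poisson shifted by 1, because both have closed-form solutions. These two families and all function names are illustrative assumptions, not taken from the 'hsmm' package.

```python
import math
import numpy as np

def fit_geometric(pmf):
    """Geometric on {1, 2, ...}: the weighted MLE is p = 1 / mean duration."""
    d = np.arange(1, len(pmf) + 1)
    p = min(1.0 / np.sum(d * pmf), 1.0 - 1e-12)  # keep log(1 - p) finite
    log_q = np.log(p) + (d - 1) * np.log1p(-p)
    return p, log_q

def fit_shifted_poisson(pmf):
    """duration - 1 ~ Poisson(lam): the weighted MLE is lam = mean duration - 1."""
    d = np.arange(1, len(pmf) + 1)
    k = d - 1
    lam = max(np.sum(d * pmf) - 1.0, 1e-12)  # keep log(lam) finite
    # log(k!) for k = 0..len(pmf)-1, built from a cumulative sum of logs.
    log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, len(pmf))))))
    log_q = k * np.log(lam) - lam - log_fact
    return lam, log_q

def cross_entropy(pmf, log_q):
    """-sum_d pmf[d] * log q(d); lower means the family fits this row better."""
    mask = pmf > 0
    return -np.sum(pmf[mask] * log_q[mask])

# Made-up test row: a shifted Poisson with lam = 2, truncated to 30 durations.
pmf = np.array([math.exp(-2.0) * 2.0**j / math.factorial(j) for j in range(30)])
pmf /= pmf.sum()

p_hat, log_q_geom = fit_geometric(pmf)
lam_hat, log_q_pois = fit_shifted_poisson(pmf)
ce_geom = cross_entropy(pmf, log_q_geom)
ce_pois = cross_entropy(pmf, log_q_pois)
```

On this row the shifted Poisson recovers `lam_hat` close to 2 and gets the lower cross-entropy, as expected. Families without a closed form would need the same objective maximized numerically, which this sketch does not cover.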