Open pavanramkumar opened 4 years ago
I wouldn't change the class signature too much any more. We made one big backwards incompatible change with this release (GLM and GLMCV). Now we should try to keep the API a little stable. Not sure I understand the second approach fully. In the first approach, you would not change anything in the GLM object except allow also objects that subclass from BaseDistribution
. Users don't have to create them as you can just provide them in a module.
from pyglmnet.distributions import NegativeBinomial
distr = NegativeBinomial(n_failure)
then inside the GLM
class you'd say:
if distr == 'negative_binomial':
self.distr = NegativeBinomial()
elif distr == 'poisson':
self.distr = Poisson()
etc. The subclassing simply ensures that all the distributions follow the same pattern with the same methods that need to be overridden.
Motivation: as we expand the scope of models supported, it may be easier for contributors to add new distributions if we refactor the class to accept distribution-specific params and corresponding likelihood / loss / gradients / hessians as callables.
One strategy inspired by
statsmodels
proposed by @jasmainak in #274 is as follows:There are also alternatives such as the
scikit-learn
approach which is less work for the low touch user who only needs to callfit
andpredict
on a model object without having to construct adistr
object: