idce data in larch.numba

janzill commented 2 years ago

I've run into an issue with idce data in larch.numba. For a reproducible example based on larch exampville, see https://gist.github.com/janzill/a961da02ac80cd961fe4688ab1398e25. Across repeated runs, the estimation with larch.numba sometimes succeeds and sometimes produces a log-likelihood of negative infinity. Even when it succeeds, the values always differ from those of larch cython (the comparison is part of the example above).

The problem seems to be in the data initialisation step: once the model parameters are set, calculating the log-likelihood gives identical results, both for the initial log-likelihood (i.e. running m.loglike() repeatedly) and for the estimated parameters when maximising the log-likelihood. However, initialising a new model instance and loading the data can give a different initial log-likelihood and, in turn, different estimation results. I've observed initial log-likelihoods of NaN, -inf, and numerical values varying over several orders of magnitude. Does this hint at the idce sharrow integration not working properly?

jpn-- commented 2 years ago

I think the sharrow integration is mostly working, but there may still be corner cases where the code is incomplete. This is one of them; apparently I missed the IDCE implementation on the quantity terms. You can see that the utility_from_data_ca function has ce terms but the quantity_from_data_ca one does not. This shouldn't be hard to fix. But there may be other corner cases waiting to be discovered, which is why larch.numba is flagged as "experimental" in a warning when you import it.

janzill commented 2 years ago

Ah yes, that makes sense, thanks Jeff

jpn-- / larch

idce data in larch.numba #22