Closed by tezer 4 years ago
Yes, this is too much of a surprise for the algorithm to handle 😢. I noted this in the docstring for `updateRecall`:
N.B. This function is tested for numerical stability for small `total < 5`. It
may be unstable for much larger `total`.
N.B.2. This function may throw an assertion error upon numerical instability.
This can happen if the algorithm is *extremely* surprised by a result; for
example, if `successes=0` and `total=5` (complete failure) when `tnow` is very
small compared to the halflife encoded in `prior`. Calling functions are asked
to call this inside a try-except block and to handle any possible
`AssertionError`s in a manner consistent with user expectations, for example,
by faking a more reasonable `tnow`. Please open an issue if you encounter such
exceptions for cases that you think are reasonable.
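As the docstring asks, callers can wrap the update in a try/except and retry with a faked, more reasonable `tnow`. Here is a minimal sketch of that pattern (the `safe_update` helper and the doubling strategy are my own illustration, not part of Ebisu's API; you would pass `ebisu.updateRecall` as `update_fn`):

```python
def safe_update(update_fn, prior, successes, total, tnow, max_tries=4):
    """Call an Ebisu-style update function, backing off `tnow` whenever
    the update raises AssertionError on numerical instability.
    `update_fn` is a stand-in for `ebisu.updateRecall`."""
    for _ in range(max_tries):
        try:
            return update_fn(prior, successes, total, tnow)
        except AssertionError:
            # Instability happens when `tnow` is tiny compared to the
            # halflife, so fake a larger elapsed time and retry.
            tnow *= 2.0
    raise ValueError("update still unstable after adjusting tnow")
```

Whether doubling `tnow` is the right "fake" is application-specific; the point is only that the `AssertionError` is catchable and recoverable.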
Are you using such large totals to try to approximate a fractional recall strength? Unfortunately that's probably not going to be possible due to these numerical issues. I recommend using `total<=3`, even if your quiz really does have more trials than this.
This is unfortunately the best I could do with 64-bit floating point math (though likely there are improvements I don't know about). While I could use arbitrary-precision numbers via the very nice `mpmath` library, I hesitate to do this because most other languages do not support arbitrary-precision arithmetic or special functions.
I'll close this, feel free to reopen if you have further questions or if I'm not being clear!
I noticed in your stacktrace that `tnow=0.1` and `tback=4.0` (the last element of the `model`), and thought to experiment with explicitly setting `tback=tnow`: this argument tells the update function what time horizon you want the updated model to be for.
So:
In [13]: ebisu.updateRecall((3.0, 3.0, 4.0), 1, 10, .1, rebalance=False, tback=.1)
Out[13]: (1.9194897423560657, 0.43238986987746275, 0.1)
In [14]: ebisu.updateRecall((3.0, 3.0, 4.0), 1, 10, .1, rebalance=True, tback=.1)
Out[14]: (0.8319919171517688, 0.6407240193877811, 0.833209151552077)
This seems to work because setting `tback=tnow` lets the algorithm work with smaller arguments to the `betaln` function, preventing overflow. In the first example above, I set `rebalance=False`, which asks the updater to return the updated model at the new time `tback`, just to confirm that it wouldn't throw an exception. The second example, with `rebalance=True`, allows it to move the updated model's time to something closer to the model's halflife.
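For context, `betaln` is the log of the Beta function, B(a, b) = Γ(a)Γ(b)/Γ(a+b). Working in log-space delays overflow, but the updater must eventually exponentiate differences of these terms, which is where large arguments bite. A stdlib sketch of the function itself (my own helper, equivalent to SciPy's `scipy.special.betaln` up to floating-point error):

```python
from math import lgamma

def betaln(a, b):
    """log B(a, b) = log Γ(a) + log Γ(b) - log Γ(a + b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)
```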
But note that in both updated models, the first one or two parameters (alpha and beta) have gone below 1, which leads to improper Beta distributions: the probability density becomes U-shaped (bimodal) instead of upside-down-U-shaped (unimodal). Trying to use these models will probably result in incorrect behavior later 😔. For example, the two models disagree on the halflife:
In [15]: ebisu.modelToPercentileDecay(ebisu.updateRecall((3.0, 3.0, 4.0), 1, 10, .1, rebalance=False, tback=.1))
Out[15]: 0.6584816172123262
In [16]: ebisu.modelToPercentileDecay(ebisu.updateRecall((3.0, 3.0, 4.0), 1, 10, .1, rebalance=True, tback=.1))
Out[16]: 1.1267087826740922
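To make the U-shape concrete, here is a minimal stdlib Beta density (my own helper, not Ebisu code). With alpha = beta = 0.5 the density piles up at the endpoints (U-shaped); with alpha = beta = 3 it is unimodal around 0.5:

```python
from math import exp, lgamma, log

def beta_pdf(x, a, b):
    """Density of a Beta(a, b) random variable at x in (0, 1)."""
    logB = lgamma(a) + lgamma(b) - lgamma(a + b)  # log of the Beta function
    return exp((a - 1) * log(x) + (b - 1) * log(1 - x) - logB)

# U-shaped when a, b < 1: more mass near the endpoints than the middle.
assert beta_pdf(0.05, 0.5, 0.5) > beta_pdf(0.5, 0.5, 0.5)
# Unimodal when a, b > 1: peak in the interior.
assert beta_pdf(0.5, 3.0, 3.0) > beta_pdf(0.05, 3.0, 3.0)
```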
In contrast, using a more reasonable `total=3` gives models that are truly equivalent after rebalancing: I see the same halflives here:
In [19]: ebisu.modelToPercentileDecay(ebisu.updateRecall((3.0, 3.0, 4.0), 1, 3, .1, rebalance=False, tback=.1))
Out[19]: 2.2700374362921734
In [20]: ebisu.modelToPercentileDecay(ebisu.updateRecall((3.0, 3.0, 4.0), 1, 3, .1, rebalance=True, tback=.1))
Out[20]: 2.269294323549595
So in this situation, even arbitrary-precision arithmetic won't help: the algorithm is so surprised by the student failing nine times out of ten only 0.1 time units after the last review, when its model expected a halflife of 4 time units, that it goes into an incorrect part of the parameter space 😤. There's unfortunately no fix for this. At best I may be able to detect when this happens and throw a `ValueError` (instead of an `AssertionError` on numerical overflow/underflow), but you'd still have to handle that.
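Such a detection step might look like this sketch (the `check_proper` helper is hypothetical, not Ebisu API):

```python
def check_proper(model):
    """Raise ValueError if an (alpha, beta, t) model has become improper,
    i.e., either Beta parameter fell below 1, giving a U-shaped density."""
    alpha, beta, _ = model
    if alpha < 1 or beta < 1:
        raise ValueError(f"improper Beta parameters: alpha={alpha}, beta={beta}")
    return model
```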
Before Ebisu 2.0, `total=1`, i.e., Ebisu only dealt with binary quizzes ("Bernoulli experiments" is the statistical term). It was relatively straightforward to extend it to `total>1` ("binomial experiments"), so I added it. The quiz model for binomial quizzes is that you have quizzed the user `total` times in a single quiz session, without giving them feedback, so their performance on each trial is statistically independent of the others. Is this the case you have? Or are you trying to use `successes` and `total` to approximate a "quiz strength"?
I ask because we don't have a lot of practical experience with this Ebisu 2.0 binomial quiz style (`total>1`), and your feedback would be most helpful.
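Concretely, the binomial model says the likelihood of `successes` out of `total` is the standard binomial probability. A stdlib sketch (my own helper, not Ebisu code) also shows how surprising 1-of-10 is when the model predicts high recall, e.g. around 0.98 shortly after a review with a 4-unit halflife:

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of k successes in n independent trials with success
    probability p: C(n, k) * p**k * (1 - p)**(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# 1 success out of 10 when recall probability is ~0.98 is astronomically
# unlikely (~5e-15), which is why the update is so "surprised".
print(binomial_pmf(1, 10, 0.98))
```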
Again, please feel free to follow up with questions. I'm not sure how familiar you are with Ebisu or the underlying statistics, so the above explanation might have been too opaque; I'm happy to elaborate.
Thank you for the detailed answer! I really appreciate your work and attitude. I am just exploring Ebisu and am planning to use it in one of my projects, which personalizes learning.
Hi, I have an error message when I use a certain combination of parameters:
It looks like it fails when the difference between `successes` and `total` is > 7: it crashed with `successes=0` and `total=8`, and with `successes=1` and `total=10`; this all happens at `tnow=0.1`.
If I increase `tnow`, the limit goes up: at `tnow=0.2` it crashes at `successes=0` and `total=11`; at `tnow=10.2` it crashes at `successes=0` and `total=64`, but runs OK if `successes=2` and `total=64`.