Expertium opened 2 weeks ago
I found a case which may help us clarify the problem:
| rating | stability | count |
|---|---|---|
| again | 0.47 | 65 |
| hard | 0.27 | 8587 |
| good | 1.13 | 15614 |
| easy | 100.0 | 4348 |
Intuitively, we need to decrease the stability of again, and the degree of the decrease should depend on the sample size: again (0.47) is estimated from only 65 reviews, yet it comes out higher than hard (0.27), which is backed by 8587 reviews, so the poorly supported again value is the one that should move. A rough sketch of that idea follows.
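One way to express that intuition (purely illustrative, not the optimizer's actual code) is a count-weighted isotonic regression on log-stability, so a rating with few reviews gets pulled toward a monotonic sequence much more strongly than well-supported ones:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Raw per-rating estimates and review counts from the table above
# (order: again, hard, good, easy).
stability = np.array([0.47, 0.27, 1.13, 100.0])
counts = np.array([65, 8587, 15614, 4348])

# Weighted isotonic fit in log-space: the 65-review "again" estimate is
# pulled down to restore monotonicity, while "hard" barely moves.
ir = IsotonicRegression(increasing=True)
log_fit = ir.fit_transform(np.arange(4), np.log(stability), sample_weight=counts)
print(np.exp(log_fit))  # non-decreasing; ties are still possible, see the strictness point below
```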
I found a problem with the current way you estimate missing values of `S0`:
```python
import numpy as np

w1 = 3 / 5
w2 = 3 / 5
# Initial stability per first rating: 1 = again, 2 = hard, 3 = good, 4 = easy
rating_stability = {1: 0.47, 2: 0.27, 3: 1.13, 4: 100.0}

def impute(rating_stability):
    # Each S0 is re-estimated from two of the other ratings via a power-law
    # combination whose exponents sum to 1 (an extrapolation when an exponent exceeds 1).
    S0_1 = np.power(rating_stability[2], 1 / w1) * np.power(rating_stability[3], 1 - 1 / w1)
    S0_2 = np.power(rating_stability[1], w1) * np.power(rating_stability[3], 1 - w1)
    S0_3 = np.power(rating_stability[2], 1 - w2) * np.power(rating_stability[4], w2)
    S0_4 = np.power(rating_stability[2], 1 - 1 / w2) * np.power(rating_stability[3], 1 / w2)
    new_rating_stability = {1: round(S0_1, 4), 2: round(S0_2, 4), 3: round(S0_3, 4), 4: round(S0_4, 4)}
    return new_rating_stability

new_rating_stability = impute(rating_stability)
print(new_rating_stability)
```
The result is `{1: 0.104, 2: 0.6676, 3: 9.3874, 4: 2.9346}`, which is non-monotonic: the imputed easy value (2.9346) falls below good (9.3874). I had an idea that builds on these estimates, but it seems your method needs to be revised first.
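A quick check (a hypothetical helper, not part of the optimizer) makes the violation explicit:

```python
def is_strictly_increasing(d):
    # Stability should rise strictly from again (1) to easy (4).
    values = [d[r] for r in (1, 2, 3, 4)]
    return all(a < b for a, b in zip(values, values[1:]))

print(is_strictly_increasing(new_rating_stability))  # False: easy (2.9346) < good (9.3874)
```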
`smooth_and_fill` isn't strict enough. We need Again < Hard < Good < Easy, not just Again <= Hard <= Good <= Easy, so that the user is less likely to see the same interval if learning steps are removed.
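For illustration only, a minimal sketch of a stricter post-processing step might look like the following; this is not the actual `smooth_and_fill` implementation, and the 5% minimum gap is an arbitrary assumption:

```python
def enforce_strict_order(rating_stability, min_ratio=1.05):
    """Force Again < Hard < Good < Easy by requiring each value to exceed
    the previous one by at least `min_ratio` (assumed 5% gap)."""
    result = {}
    prev = 0.0
    for rating in (1, 2, 3, 4):
        value = rating_stability[rating]
        if prev > 0 and value <= prev * min_ratio:
            value = prev * min_ratio
        result[rating] = round(value, 4)
        prev = result[rating]
    return result

print(enforce_strict_order({1: 0.5, 2: 0.5, 3: 1.2, 4: 1.2}))
# {1: 0.5, 2: 0.525, 3: 1.2, 4: 1.26}
```

A multiplicative gap is only one option; whatever rule is chosen, the point is that adjacent ratings should never map to identical S0 values.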