brishtibheja opened 4 weeks ago

Using difficulty_asc as a sort order, I noticed more of my lapses occurring at later portions of the review session despite retrievability being the same. In Anki's stats screen, searches like prop:d>0.9 and prop:d<0.9 prop:d>0.8 and looking at the graphs make it more evident. I find FSRS consistently underestimating R for easier cards, such that average retention is higher than the set DR for cards low in difficulty. This is in contrast with cards in prop:d>0.9, where retention is almost 10 percentage points below DR. It's erring in both directions. Previously, @richard-lacasse has also reported a similar experience.

One initial idea to solve this might be to try incorporating D in the formula for the forgetting curve. Has this been tried before in some way?
One initial idea to solve this might be to try incorporating D in the formula for the forgetting curve. Has this been tried before in some way?
Nope. It would complicate things a lot. It's not compatible with how the initial S is estimated, it would screw up the "A 100 day interval will become x days." thingy and it would make stability less interpretable.
R should only depend on S and time elapsed. D affects R indirectly by affecting S.
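For concreteness (the constants here are the FSRS-4.5 ones; other versions differ slightly), the current curve is a power function of t/S alone:

$$R(t, S) = \left(1 + \frac{19}{81}\cdot\frac{t}{S}\right)^{-0.5}, \qquad R(S, S) = 0.9$$

D never appears in it; it only moves S up or down during the stability updates.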
So, the actual issue here is that S is not calculated accurately for high and low D values.
I'm not sure if this is actually a problem, or just the algorithm tempering the speed at which it adapts to cards that are subjectively very easy or very hard. You don't want to overfit the model. It might end up being better that it doesn't "jump to conclusions" as it adjusts the difficulty. The variance of subjective difficulty is so broad that there are always going to be cards that are scheduled incorrectly, but those are the cards FSRS will learn the most from each time.
That being said, if it is a problem I agree with @user1823.
So, the actual issue here is that S is not calculated accurately for high and low D values.
I'm guessing you'd want to change how D is handled in the Stability equation.
I'm guessing you'd want to change how D is handled in the Stability equation.
Tried that too. I couldn't find anything that improved the results.
I was just thinking because I often have backlogs, maybe that affected the stats for highly difficult cards.

It's possible. But still, that wouldn't mean cards in the prop:d<0.9 prop:d>0.8 range will look like this:

I am going through a backlog of 2k cards this month, which I got from rescheduling. My DR is set to .85.
I think we should make RMSE (bins) create the bins according to difficulty too. I took a quick look at how it's done and didn't find anything like this.
I think we should make RMSE (bins) create the bins according to difficulty too. I took a quick look at how it's done and didn't find anything like this.
We can't make the binning method depend on D, S or R (well, IIRC we can, it's just painfully slow). The binning depends on:
1) the total number of reviews
2) the interval length
3) the number of lapses (Agains)
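A minimal sketch of what bin-based RMSE looks like under those three keys. This is a toy illustration, not the actual fsrs-optimizer code, and the record fields ('n_reviews', 'delta_t', 'lapses', 'p', 'y') are made up:

```python
from collections import defaultdict
import math

def rmse_bins(reviews):
    bins = defaultdict(list)
    for r in reviews:
        # The key deliberately uses review count, interval length and
        # lapse count -- never D, S or R directly.
        key = (r["n_reviews"], r["delta_t"], r["lapses"])
        bins[key].append((r["p"], r["y"]))  # predicted R, actual recall (0/1)
    sq_err, n = 0.0, 0
    for items in bins.values():
        pred = sum(p for p, _ in items) / len(items)  # mean predicted R
        obs = sum(y for _, y in items) / len(items)   # observed retention
        sq_err += (pred - obs) ** 2 * len(items)      # weight by bin size
        n += len(items)
    return math.sqrt(sq_err / n)
```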
I'd say that the number of lapses is a good proxy for D
I'd say that the number of lapses is a good proxy for D
Yes, you might be right. So we should see metrics improving if R prediction is improved for low/high D cards.
Actually, why is it not something like Pass/Fail ratio though? That sounds better to me.
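Relatedly, a quick way to check the original report against one's own collection is to compare mean predicted R with observed retention per difficulty band. A sketch with hypothetical field names ('d' is difficulty scaled to [0, 1], 'p' the predicted R, 'y' the binary outcome):

```python
def calibration_by_difficulty(reviews, n_bands=5):
    bands = [[] for _ in range(n_bands)]
    for r in reviews:
        idx = min(int(r["d"] * n_bands), n_bands - 1)  # which D band
        bands[idx].append((r["p"], r["y"]))
    for i, items in enumerate(bands):
        if not items:
            continue
        pred = sum(p for p, _ in items) / len(items)   # mean predicted R
        obs = sum(y for _, y in items) / len(items)    # observed retention
        lo, hi = i / n_bands, (i + 1) / n_bands
        print(f"D in [{lo:.1f}, {hi:.1f}): predicted {pred:.3f}, "
              f"observed {obs:.3f}, n={len(items)}")
```

If FSRS underestimates R for easy cards and overestimates it for hard ones, the gap should show up with opposite signs at the two ends.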
R should only depend on S and time elapsed. D affects R indirectly by affecting S.
@user1823 That cannot achieve all the effects we would possibly want.
Consider cards with a stability of 7d. Assuming Anki has correctly assigned the stability, you remember around 90% of the material a week later. But a month and a half later, do we expect to remember the highly difficult cards at all? In some disciplines, it's possible you forget almost all the difficult cards a month later but you still retain some of the relatively easier cards. (I think this happens naturally, but also for reasons like getting more real-life encounters with easier material, or inherent reviews of easier material while doing other, harder Anki cards.)
I think the way you'd make changes that take that into account is by varying the formula for R based on what value D has taken. As D rises, say, the curve for R gets steeper and steeper.
Re: making S meaningful
That can be done if all the forgetting curves for the same S intersect at some point. Then you can possibly still say that S equals the time it takes for R to reach .90, etc.
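As a toy construction with exactly that property (mine, not anything FSRS currently uses): let D steer the exponent of an exponential curve,

$$R(t) = 0.9^{\,(t/S)^{k(D)}}$$

Every member of the family passes through (S, 0.9), so "S is the time for R to fall to 90%" survives, while a larger k(D) keeps R flatter before S and makes it drop faster after S.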
In some disciplines, it's possible you forget almost all the difficult cards a month later but you still retain some of the relatively easier cards.
This is just another way of saying that the stability of some cards is less than that of the others.
I don't get it. In my example, stability was the same because R was really 90% after a week for both the easy and the harder card.
How is it possible that R decreases at the same rate for both cards in the first week but later decreases faster for one of the cards?
If one card is harder, the R should decrease faster in the first week too, which means that its R after one week can't be equal to that of the other card.
It's kind of possible: https://www.desmos.com/calculator/9pbylwb5yu. See how R falls much faster if the function is exponential? You may be surprised what happens if we zoom in to the beginning, when S is less than or equal to 1: all three curves are practically indistinguishable.
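In case the Desmos link rots, here is the same comparison in a few lines of Python. The three shapes are my stand-ins for the ones in the link (one exponential, two power curves), all normalized so that R(S) = 0.9:

```python
import math

S = 7.0  # stability in days; every curve satisfies R(S) = 0.9

def exponential(t):
    return 0.9 ** (t / S)                    # pure exponential decay

def power_steep(t):
    return (1 + 19 / 81 * t / S) ** -0.5     # FSRS-4.5-style power curve

def power_flat(t):
    return (1 + t / S) ** math.log(0.9, 2)   # a flatter power curve

for t in (1, 3, 7, 45, 90):
    print(f"t={t:>2}d  exp={exponential(t):.3f}  "
          f"steep={power_steep(t):.3f}  flat={power_flat(t):.3f}")
```

At t = 1 the three values differ by about 0.005, but by t = 45 the exponential has fallen to roughly 0.51 while the power curves are still above 0.63.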
So yeah, theoretically we could change the shape of the curve based on D, but as I said earlier
It's not compatible with how the initial S is estimated, it would screw up the "A 100 day interval will become x days." thingy and it would make stability less interpretable.
How is it possible that R decreases at the same rate for both cards in the first week but later decreases faster for one of the cards?
I'm sure we'll need evidence for that, but for now only experience guides me. E.g., I remember a lot of the easy stuff I learned in Spanish but have forgotten almost all the hard stuff, though I learned them around the same time.
If one card is harder, the R should decrease faster in the first week too
You failed the easy card and its S became 1 week. I don't see why the stability can't be the same. You'd just need to have started learning the cards at different times.
it would make stability less interpretable.
I think I answered this, but I don't have any solution for the other two. Maybe in the pop-up case, we can show an estimate like "intervals will increase by 13%", which would be the average.
E.g., I remember a lot of the easy stuff I learned in Spanish but have forgotten almost all the hard stuff, though I learned them around the same time.
That just means that stability was different.
You failed the easy card and its S became 1 week.
I am not talking about the stability given by FSRS. I am saying that the actual S for those cards is different, and this is the problem: FSRS is not able to calculate S with 100% accuracy.
theoretically we could change the shape of the curve based on D
Well, long ago (when we were developing FSRS 4), I said that we are introducing a power forgetting curve in FSRS only because we are unable to accurately calculate S. But forgetting is exponential in nature. So, if we are somehow able to make the calculation of S very accurate, we will start using the exponential forgetting curve again.
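A quick sketch of why inaccurate S estimates make a power curve fit better (my illustration of the standard mixture-of-exponentials argument, not anything from the FSRS code): if every card truly decays exponentially but the true stabilities are scattered around our single estimate, the average curve has a heavier tail than any one exponential:

```python
import math, random

random.seed(0)
# Each card truly forgets exponentially, R_i(t) = 0.9 ** (t / S_i),
# but the true S_i are spread around the S = 7 we believe in.
stabilities = [random.lognormvariate(math.log(7), 0.8) for _ in range(10_000)]

for t in (1, 7, 30, 90):
    mixture = sum(0.9 ** (t / s) for s in stabilities) / len(stabilities)
    single = 0.9 ** (t / 7)
    print(f"t={t:>2}d  mixture={mixture:.3f}  single exponential={single:.3f}")
```

The mixture stays well above the single exponential at long intervals, which is the power-law-like behaviour; with per-card S estimated perfectly, the aggregate would collapse back into individual exponentials.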
If we change the shape of the curve based on D, given the same S, we'll obtain a family of forgetting curves intersecting at the point (S, 90%). However, they are non-overlapping.
Khajah, M. M., Lindsey, R. V., & Mozer, M. C. (2014). Maximizing Students’ Retention via Spaced Review: Practical Guidance From Computational Models of Memory. Topics in Cognitive Science, 6(1), 157–169. https://doi.org/10.1111/tops.12077