open-spaced-repetition / fsrs4anki

A modern Anki custom scheduling based on Free Spaced Repetition Scheduler algorithm
https://github.com/open-spaced-repetition/fsrs4anki/wiki
MIT License
2.56k stars, 127 forks

[Feature Request] A year into using FSRS, I’m convinced we could benefit greatly from separate retention rates for “learning” and “known” items. #694

Open aedoncassiel opened 2 days ago

aedoncassiel commented 2 days ago

Essentially, I have had a ton of success dropping my desired retention rate far down so I can cram many new items without spending five hours per day on review. Then I can quickly get the items I turned out to have learned very easily far out of the way, and focus all my time in the beginning on the harder ones. The items that turn out to be easy and hard aren't always what I thought they would have been, so this is immensely valuable.

However: let's say I mastered the Japanese kana in a handful of days. Now, even if I keep that deck at 99% retention, this only asks me to review a single kana every few weeks—if that.

If I'm dealing with finite, well defined, separate decks, I can just up the retention on that deck. But what if I'm using a "Japanese" deck and it happens to include both kana and thousands of new kanji I'm learning? I can't set the whole deck to 99%, but setting it lower isn’t ideal for the kana because if I take a break from reading in Japanese, there's no reason not to make sure I review a kana every few weeks and keep kana reading perfectly fresh for practically zero cost.

I'm using a simplified example to get the general point across. In fact, for every one of us, all of our decks have a mixture of "kana" and "kanji": new items we're struggling through, and old items we've learned so well we could probably even put them at 99% retention with no downside.

The use case I have in mind here is particular, to be fair. In most cases most people want to learn something new they'll be using for a set number of years, and are fine with forgetting if they take up a different job or hobby, and so on. In my case, I want to maintain high literary proficiency over my lifespan with multiple new languages, even over periods where I'm not fitting much practice with that language in.

So I see this issue very clearly in my Spanish deck. I have always had, and like having, one deck for this language. I am very comfortable with several thousand intermediate, non-cognate words. My optimal retention rate to spend minimum time on the deck in the next month? 70%. In the next six months? 70%. In the next year? 70%.

... and in the next decade? Well, suddenly optimal retention rate rockets up to 90%. I believe this is clear evidence of "mature" items being scheduled out too far because new and mature items do not have the same optimal retention.

Now I have a few options. I can lock this deck, up the retention rate, push every unseen card in the deck to Spanish2, and plan to keep this up every year or so with each language for years until my menu has Spanish 1-20, Japanese 1-20, and so on. Set Spanish2 to 70% and Spanish1 to 90%, and so on with Spanish3 and Spanish2 next year. Or, I can just set my one deck to 90% and lose tons of efficiency over-reviewing new items. Or, I can leave it at 70% and get new words down much more efficiently, while then wastefully forgetting too many of them after several more months pass, when a quick review every ~6 months, say, would have sufficed to keep all of them locked perfectly in memory. Of course I can also set a maximum interval, but I would still lose efficiency as I inevitably over- or under-guess a good baseline and lump every item into yet another over-generalized standard.

I think a built-in ability to start at low retention and then raise the retention rate, per item, as the cost of keeping that item at high retention becomes trivial, could potentially be as groundbreaking as FSRS itself is.

brishtibheja commented 2 days ago

L.M.Sherlock tried a system that varies desired retention for each individual card but it didn't go too far.

I think a built-in ability to start at low retention and then raise the retention rate, per item, as the cost of keeping that item at high retention becomes trivial, could potentially be as groundbreaking as FSRS itself is.

Agreed, this could be good, but the UI for such a thing would be complex. Also, wouldn't a lot of mature knowledge already be so deeply encoded semantically that you wouldn't need to review it per se? Not sure on that front.

Expertium commented 2 days ago

Essentially, I have had a ton of success dropping my desired retention rate far down so I can cram many new items without spending five hours per day on review.

Make sure it doesn't go below the minimum recommended retention.

There are two issues with your idea.

First, a greater cognitive burden for the user, who will have to configure two different values of desired retention instead of one, and people are already struggling with realizing that desired retention affects interval lengths. I'm not sure how many users know it, my pessimistic estimate would be 50%. In other words, I'd say about 50% of users have no idea that desired retention affects interval lengths. I'm saying this because I've been doing the Anki equivalent of tech support for about a year. Maybe 75% know it, if I'm being optimistic. Even fewer have ever touched "Compute minimum recommended retention" or used different values of desired retention for different presets. I think FSRS should remove options and settings rather than adding them, if we ever want FSRS to be used by anyone who isn't a complete nerd.

Second issue - defining what counts as "mature". It's arbitrary. In Anki a card is considered "mature" if its interval is >=21 days, but why not 20 or 22?

Side note: Jarrett has been working on a special "regime" for FSRS where it doesn't maintain a specific level of desired retention, and instead tries to make the memory stability as high as possible as fast (in terms of time spent on reviews) as possible, but it seems that it doesn't always work.

aedoncassiel commented 2 days ago

Second issue - defining what counts as "mature". It's arbitrary. In Anki a card is considered "mature" if its interval is >=21 days, but why not 20 or 22?

I think, not necessarily. Because we’re basing this off of optimal retention to spend minimal time, which is what makes FSRS so valuable to begin with.

So, if I take any deck with lots of new cards and lots of cards I've seen for months, and I separate those cards in different decks, the calculator is going to tell me (at least in my experience with several decks so far) that the optimal retention of the new cards for the next year is pretty low and the optimal retention for the old cards for the next decade is a lot higher. Remember that for this latter set, the data has actually had time to push these words out ~8 months and then see how many of them I do in fact recall.

So, this shows me that the calculator already knows that the optimal retention to spend minimum time is different for these two sets. I think this is simply because pushing known items out a full year until I forget perhaps 30% of them is very inefficient when perhaps one five second review in the many months prior might even have been enough to keep me at 99% retention for all these items.

Possibly, certainly at least in theory, a more advanced calculator could in and of itself determine what the most effective definition of “matured knowledge” is for different users in different decks by simulating different cut-off points across which to target different retentions, just like it simulates different retention rates to find the optimal retention now. (I can't even see how it would hurt to have this happen silently under the hood, without the user knowing anything different.)

Barring that, of course, I do think even a lazy and arbitrary single cut-off point somewhere would still go some way to address the reality that the optimal retention to spend minimum time is indeed different for, in a broad sense, “things you’re learning” and “things you know”. I struggle to imagine any way an imperfect but partial solution to this could make anything worse.
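
To make the "single cut-off point" idea concrete, here is a minimal sketch, not anything FSRS or Anki implements. It assumes the FSRS-4.5 power forgetting curve (DECAY = -0.5, FACTOR = 19/81), uses memory stability rather than interval as the cut-off variable, and borrows the 70% and 90% figures from the Spanish example above. The function names and the 180-day cut-off are hypothetical.

```python
# Hypothetical two-tier desired retention; not an existing FSRS/Anki option.
DECAY = -0.5        # FSRS-4.5 forgetting-curve exponent (assumed here)
FACTOR = 19 / 81    # chosen so that retrievability is 0.9 when t == stability


def desired_retention_for(stability_days, cutoff_days=180.0,
                          learning_dr=0.70, known_dr=0.90):
    """Pick a retention target from the card's stability (arbitrary cut-off)."""
    return known_dr if stability_days >= cutoff_days else learning_dr


def next_interval(stability_days, cutoff_days=180.0):
    """Days until retrievability falls to the tier's desired retention."""
    dr = desired_retention_for(stability_days, cutoff_days)
    return stability_days / FACTOR * (dr ** (1 / DECAY) - 1)


if __name__ == "__main__":
    for s in (5, 60, 365, 2000):
        print(s, desired_retention_for(s), round(next_interval(s), 1))
```

A card with low stability is scheduled against the 70% target, while one past the cut-off is scheduled against 90%, so its interval equals its stability under this curve.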

Expertium commented 2 days ago

Btw, why not just adjust max. interval?

brishtibheja commented 2 days ago

If the time needed for R to reach 0.99 is longer than max_interval, then you'll have really suboptimal scheduling.
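
One way to read this point, sketched under the assumption of the FSRS-4.5 forgetting curve (DECAY = -0.5, FACTOR = 19/81): for a very stable card, the interval that would bring R down to 0.99 can exceed the maximum interval, so the cap forces a review while R is still above the target. The function names and the example numbers are illustrative only.

```python
# Sketch of how a maximum interval interacts with a high desired retention.
DECAY = -0.5        # FSRS-4.5 forgetting-curve exponent (assumed here)
FACTOR = 19 / 81    # chosen so that retrievability is 0.9 when t == stability


def retrievability(t_days, stability):
    """Predicted recall probability after t_days for a given stability."""
    return (1 + FACTOR * t_days / stability) ** DECAY


def interval_for_retention(stability, desired_retention):
    """Days until retrievability falls to desired_retention."""
    return stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)


def clamped_schedule(stability, desired_retention, max_interval):
    """Return (ideal interval, capped interval, R at the capped interval)."""
    ideal = interval_for_retention(stability, desired_retention)
    actual = min(ideal, max_interval)
    return ideal, actual, retrievability(actual, stability)


if __name__ == "__main__":
    ideal, actual, r = clamped_schedule(stability=10_000,
                                        desired_retention=0.99,
                                        max_interval=180)
    print(f"ideal={ideal:.0f}d capped={actual:.0f}d R_at_review={r:.4f}")
```

When the capped interval is shorter than the ideal one, the card is reviewed at a retrievability higher than the target, which is the over-reviewing cost described above.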

L-M-Sherlock commented 2 days ago

In fact, if we set aside the optimum retention and just look for the optimum intervals, we get a gradually increasing retention:

image

The recall probability corresponding to optimal interval increases with half-life and decreases with difficulty, as shown in Figure 9(c). It means that the scheduler will instruct learners to review at a lower retrieval strength in the early stages of memorization, which may be a reflection of "desirable difficulties"[2]. As the half-life increases to the target value, the recall probability approaches 100%. According to the equation Δ𝑡 = −h · log2 𝑝 and the trend of 𝑝 on h, Δ𝑡 is first increasing and then decreasing where the peak emerges.
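
The relation Δt = −h · log2 p quoted above can be checked numerically. The sketch below is only an illustration: the linear trend of p against h (rising from 0.5 to 1 as h approaches a target half-life H) is a made-up stand-in, not the curve from the paper, and the function names are hypothetical. It shows how a rising p can make the interval grow at first and then shrink.

```python
import math


def interval_from_recall(half_life, p):
    """Delta t = -h * log2(p): days until recall probability decays to p."""
    return -half_life * math.log2(p)


def toy_recall_target(half_life, target_half_life):
    """Made-up trend: p rises linearly from 0.5 to 1.0 as h approaches H."""
    return 0.5 + 0.5 * half_life / target_half_life


def interval_profile(target_half_life=100, step=10):
    """List of (h, p, delta_t) along the toy trend."""
    rows = []
    for h in range(step, target_half_life + 1, step):
        p = toy_recall_target(h, target_half_life)
        rows.append((h, p, interval_from_recall(h, p)))
    return rows


if __name__ == "__main__":
    for h, p, dt in interval_profile():
        print(f"h={h:3d}  p={p:.2f}  delta_t={dt:6.2f}")
```

With p = 0.5 the interval equals the half-life, as expected from the definition of half-life. Under the toy trend the interval peaks partway and falls to zero as p approaches 1.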

Source: my paper