open-spaced-repetition / fsrs-rs

FSRS for Rust, including Optimizer and Scheduler
https://crates.io/crates/fsrs
BSD 3-Clause "New" or "Revised" License

[BUG] Potential inconsistency in optimal_retention.rs #79

Closed. Expertium closed this issue 1 year ago.

Expertium commented 1 year ago

In the current beta, the bounds for desired retention are 0.8 and 0.97. However, it seems that in optimal_retention.rs, they are 0.75 and 0.95 instead.

Unrelated to this issue, but I noticed that changing the number of days to simulate in the beta has such a big impact on optimal retention that just by changing that setting alone, I can make the simulator output any number between 0.8 and 0.95. This makes me question whether the simulator is accurate.

L-M-Sherlock commented 1 year ago

If you increase the number of days to simulate and keep the deck size unchanged, the optimal retention will increase. It is normal.

L-M-Sherlock commented 1 year ago

@dae, I recommend limiting desired retention within [0.7, 0.97], could you update it in the next beta release?

Expertium commented 1 year ago

Actually, I think it's better to change the bounds in optimal_retention.rs to 0.8 and 0.97.

L-M-Sherlock commented 1 year ago

0.8 would be a little high for some users.

user1823 commented 1 year ago

0.8 would be a little high for some users.

Wozniak says:

You can also increase the forgetting index up to 20% to greatly increase the speed of learning at the cost of knowledge retention. Increasing the forgetting index further makes little sense as both retention and the speed of learning will decrease (you can read about it in Theoretical aspects of SuperMemo or see it for yourself by using Toolkit : Statistics : Simulation).

Source: https://help.supermemo.org/wiki/Learning_tab_in_Options

L-M-Sherlock commented 1 year ago

It is possible that Woz is wrong. In my research, the optimal retention for language learning is about 0.75.

user1823 commented 1 year ago

Wozniak also says this:

  • The greatest overall increase in the optimal interval can be observed for the forgetting index of about 20%. The overall increase takes into the consideration the fact that for forgotten items, the optimal interval decreases. Therefore, for the forgetting index greater than 20%, the positive effect of long intervals on memory resulting from the spacing effect is offset by the increasing number of forgotten items.
  • The greatest overall knowledge acquisition rate is obtained for the forgetting index of about 20-30% (see Figure 3). This results from the trade-off between reducing the repetition workload and increasing the relearning workload as the forgetting index progresses upward. In other words, high values of the forgetting index result in longer intervals, but the gain is offset by an additional workload coming from a greater number of forgotten items that have to be relearned.

Source: https://super-memory.com/articles/theory.htm

I am not sure how the above two points differ from each other. Perhaps, the 25% FI you suggest is related to the second point.

L-M-Sherlock commented 1 year ago

And Woz also said that:

We used to claim that the best speed of learning can be achieved with the forgetting index of 30-40%.

Source: https://supermemo.guru/wiki/SuperMemo_Algorithm:_30-year-long_labor#Expected_increase_in_memory_stability

Expertium commented 1 year ago

Ok, let's just choose 0.75. What about the upper limit - 0.95 or 0.97 (SuperMemo uses 0.97)? I think this one is quite important, because changing requested retention from 0.95 to 0.97 will shorten intervals by a factor of about 1.8, which is a big difference.
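A rough check of that factor, assuming the FSRS v4 power-law forgetting curve (an assumption; the exact shape depends on the FSRS version in use):

$$
R(t,S)=\left(1+\frac{t}{9S}\right)^{-1}
\;\Rightarrow\;
I(r,S)=9S\left(\frac{1}{r}-1\right),
\qquad
\frac{I(0.95,S)}{I(0.97,S)}=\frac{1/0.95-1}{1/0.97-1}\approx 1.7
$$

so for the same stability, raising desired retention from 0.95 to 0.97 shortens intervals by roughly a factor of 1.7–1.8.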

L-M-Sherlock commented 1 year ago

I recommend setting [0.7, 0.97] as a hard limit (the user can't input values outside it) and [0.75, 0.95] as a soft limit (the optimizer searches for the best retention only within this range).
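A minimal sketch of that two-level proposal, with hypothetical names (this is not the actual fsrs-rs or Anki API; the real bounds live in optimal_retention.rs and the UI):

```rust
// Hard limits: the UI rejects/clamps any desired retention outside this range.
const HARD_MIN: f64 = 0.70;
const HARD_MAX: f64 = 0.97;
// Soft limits: the optimizer only searches for the best retention in this range.
const SOFT_MIN: f64 = 0.75;
const SOFT_MAX: f64 = 0.95;

/// Clamp a user-supplied desired retention to the hard limits.
fn clamp_desired_retention(input: f64) -> f64 {
    input.clamp(HARD_MIN, HARD_MAX)
}

/// Range searched when suggesting an optimal retention.
fn optimizer_search_range() -> (f64, f64) {
    (SOFT_MIN, SOFT_MAX)
}
```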

L-M-Sherlock commented 1 year ago

If the user's cards are too hard, R=0.7 is very common for them. They can set a low retention, but the optimizer would never suggest it.

Expertium commented 1 year ago

Ok, I think that's fine. Just make it clear to Dae that you want two different bounds for the UI in the settings and for the retention optimizer. Btw, I still don't know what exactly it maximizes. Average stability of all cards?

L-M-Sherlock commented 1 year ago

It maximizes the total knowledge retention. In other words, the summation of the retrievability of all cards.
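An illustrative sketch of that objective (not the actual fsrs-rs simulator code; field names are hypothetical, and the FSRS v4 forgetting curve R(t, S) = (1 + t/(9S))^(-1) is assumed):

```rust
// A simulated card: memory stability and time elapsed since the last review.
struct SimCard {
    stability: f64,    // stability in days
    elapsed_days: f64, // days since the last review
}

/// "Total knowledge retention": the sum of the current retrievability of every card.
fn total_knowledge_retention(cards: &[SimCard]) -> f64 {
    cards
        .iter()
        .map(|c| (1.0 + c.elapsed_days / (9.0 * c.stability)).recip())
        .sum()
}
```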


meliache commented 1 year ago

Psychological effects of the retention aren't taken into account, right? Personally, I think that studying with too low a retention (below 75%) might be demotivating, because you feel like a failure for forgetting many cards. Too high a retention can result in more "easy" cards, which could be boring. In contrast, the right level of challenge helps with getting into a flow state. Also, the study time does not just depend on the number of reviews but also on the time spent per card. I would guess that I might spend more time on cards that I cannot answer, because I usually wait around 10 seconds to see whether the answer might by any chance spring into my mind. I'm just hypothesizing here based on my personal experience; I haven't read any research on that (I would be interested if you know any), but those are my motivations for not using a retention below 80%.

Expertium commented 1 year ago

I agree that retention below 75% would feel demotivating. As for very high retention, it might be useful for those who are preparing for an important exam.

Also, the study time does not just depend on the number of reviews but also on the time spent per card. I would guess that I might spend more time on cards that I cannot answer, because I usually wait around 10 seconds to see whether the answer might by any chance spring into my mind.

The simulator takes that into account. It estimates how relatively likely you are to press Hard, Good and Easy (for example, maybe you are 10 times more likely to press Good than Easy), and how the time spent depends on the answer; it uses 4 different values for Again, Hard, Good and Easy.
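An illustrative sketch of that idea (not the actual fsrs-rs simulator API): the expected time cost of one review is a probability-weighted average of per-button costs, with separate values for Again, Hard, Good and Easy.

```rust
/// Expected time cost of a single review of a card with the given retrievability.
fn expected_review_cost(
    retrievability: f64,    // probability the card is recalled (R)
    rating_probs: [f64; 3], // P(Hard), P(Good), P(Easy) given recall; sums to 1.0
    rating_costs: [f64; 3], // seconds spent when pressing Hard, Good, Easy
    again_cost: f64,        // seconds spent (including relearning) when pressing Again
) -> f64 {
    // Average cost of a successful review, weighted by how often each button is pressed.
    let recall_cost: f64 = rating_probs
        .iter()
        .zip(rating_costs.iter())
        .map(|(p, cost)| p * cost)
        .sum();
    retrievability * recall_cost + (1.0 - retrievability) * again_cost
}
```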

L-M-Sherlock commented 1 year ago

Personally I think that studying with a too low retention (below 75%) might be demotivating because you feel like a failure for forgetting many cards

I added loss_aversion to increase the cost of forgetting in the Python optimizer: https://github.com/open-spaced-repetition/fsrs-optimizer/blob/94be1a22d93e84e121186ea7181f55bc826639d3/src/fsrs_optimizer/fsrs_optimizer.py#L1180-L1180

By default, the optimizer will multiply forget_cost by 2.5. But it's still experimental; I'm not sure whether to add it to fsrs-rs.
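A hypothetical Rust counterpart of that loss_aversion idea (the linked code is in the Python optimizer; this is not the fsrs-rs API): the cost of a lapse is inflated by a constant factor so the simulator penalizes forgetting more heavily than its raw review time.

```rust
// Default multiplier mentioned above; applied only to the cost of pressing Again.
const LOSS_AVERSION: f64 = 2.5;

fn adjusted_again_cost(again_cost_seconds: f64) -> f64 {
    again_cost_seconds * LOSS_AVERSION
}
```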

szalejot commented 1 year ago

Regarding this statement:

If you increase the number of days to simulate and keep the deck size unchanged, the optimal retention will increase. It is normal.

I have a deck for long-term learning (personal language learning), so potentially for many years, and I am not adding many cards to it. In such a situation, what simulation length is advised? Should I set it to the maximum? And what about the deck size setting? I could enter the current value, an estimate of the deck size after a year (estimating how many cards I will add in that time), or something in between.

Expertium commented 1 year ago

@szalejot https://github.com/open-spaced-repetition/fsrs-rs/issues/81#issuecomment-1734696475

Expertium commented 1 year ago

@L-M-Sherlock I think loss aversion should only apply when the (simulated) user is pressing "Again". Not many people are averse to spending time on a card to get it right, but many people are averse to spending time on a card just to get it wrong. In other words, loss aversion should only increase the cost for forgotten cards.

L-M-Sherlock commented 1 year ago

I think loss aversion should only apply when user is pressing "Again".

The current code already works as you described.