thyagoluciano / sm2

SM-2 is a simple spaced repetition algorithm. It calculates the number of days to wait before reviewing a piece of information based on how easily the information was remembered today.
GNU General Public License v3.0

New Ease Factor only calculated when q >=3 #5

Open johnfarmind opened 4 years ago

johnfarmind commented 4 years ago

https://github.com/thyagoluciano/sm2/blob/5a7081f4141ae1743069579c2a1c450edc8adc05/lib/sm.dart#L25-L26

I think the SM-2 algorithm is not implemented correctly here because the EF is not calculated on every review. When a q below 3 is given, the EF is not calculated. This means that a q of 0-2 has no effect on the EF whatsoever, which is not how the algorithm is intended to work.

The algorithm is intended to lower the EF when the user keeps answering incorrectly (q 0-2). In the current code base, the Ease Factor can never go below 2.5 which is the default value.

alankan886 commented 4 years ago

Hello @johnfarmind, I'm not the owner of this project, but I think I can answer your question.

> When a q below 3 is given, the EF is not calculated. This means that a q of 0-2 has no effect on the EF whatsoever, which is not how the algorithm is intended to work.

If you go to the official SuperMemo-2 page (https://www.supermemo.com/en/archives1990-2015/english/ol/sm2), it actually says:

> If the quality response was lower than 3 then start repetitions for the item from the beginning without changing the E-Factor (i.e. use intervals I(1), I(2) etc. as if the item was memorized anew).

And to answer this part...

> The algorithm is intended to lower the EF when the user keeps answering incorrectly (q 0-2). In the current code base, the Ease Factor can never go below 2.5 which is the default value.

Well, if every attempt from the very beginning has q < 3, then yes, the EF stays at 2.5 until q is 3 or larger. But when q < 3, the interval doesn't depend on the EF, so the value of the EF doesn't affect the calculation of the next review date.

As the quote above from the official document says, the item is treated as if it were brand new (the base case), so the interval is 1 and the next review date is the next day. (I think you may use the other base case too, in which the interval would be 6.) This follows how spaced repetition works: when you are not familiar with the subject you review sooner, and vice versa.
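A minimal Python sketch of the interval rule as this comment reads it may make the claim concrete. The function name `next_interval` and its parameters are hypothetical and are not taken from `lib/sm.dart`; the constants 1 and 6 are the base cases I(1) and I(2) named above.

```python
from typing import Tuple


def next_interval(repetition: int, prev_interval: int, ef: float, q: int) -> Tuple[int, int]:
    """Return (new_repetition_count, interval_in_days).

    Sketch of the reading in this comment only: a failing grade restarts
    the sequence, and the EF is not consulted on that branch.
    """
    if q < 3:
        # Failed recall: treat the item as brand new, review tomorrow.
        return 0, 1
    if repetition == 0:
        return 1, 1                      # I(1) = 1
    if repetition == 1:
        return 2, 6                      # I(2) = 6
    # I(n) = I(n-1) * EF for later successful reviews.
    return repetition + 1, round(prev_interval * ef)


# A failing grade gives the same result whatever the EF is.
print(next_interval(5, 40, 2.5, 2) == next_interval(5, 40, 1.3, 2))
```

Under this reading the EF only matters again once the item has been answered successfully more than twice in a row.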

johnfarmind commented 4 years ago

> Well, if every attempt from the very beginning has q < 3, then yes, the EF stays at 2.5 until q is 3 or larger. But when q < 3, the interval doesn't depend on the EF, so the value of the EF doesn't affect the calculation of the next review date.

This is correct, but you are forgetting a very important thing. When q is lower than 3, the EF has to become lower so that the subsequent intervals become shorter. If this wasn't the case, the SM-2 algorithm would be pretty useless (or, at least, too simple). The whole idea of the algorithm is that the EF adapts to the user's answers and shows a card more often when it is answered incorrectly. This is how a good spaced repetition algorithm works.
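To illustrate how much the EF shapes the later intervals, here is a small Python sketch that holds the EF fixed and lists the intervals for consecutive successful reviews. Holding the EF fixed is a simplification for illustration, and `interval_schedule` is a hypothetical helper, not code from the repository.

```python
def interval_schedule(ef, successes):
    """Intervals in days for consecutive successful reviews at a fixed EF."""
    intervals = []
    for n in range(1, successes + 1):
        if n == 1:
            intervals.append(1)          # I(1) = 1
        elif n == 2:
            intervals.append(6)          # I(2) = 6
        else:
            # I(n) = I(n-1) * EF, rounded to whole days.
            intervals.append(round(intervals[-1] * ef))
    return intervals


print(interval_schedule(2.5, 5))   # default EF
print(interval_schedule(1.3, 5))   # lowest EF allowed by the 1.3 floor
```

A card whose EF has dropped to the floor comes back far sooner after the first two reviews than one still at the default.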

Also, if EF can never fall below 2.5, why is the below code needed at all?

https://github.com/thyagoluciano/sm2/blob/5a7081f4141ae1743069579c2a1c450edc8adc05/lib/sm.dart#L33-L35

Sorry to say, but the code does not implement the SM-2 algorithm correctly.

alankan886 commented 4 years ago

Hi @johnfarmind,

The EF is left unchanged only when q < 3. A response of q = 3 drives the EF down, and if q = 3 keeps occurring the calculated value eventually becomes lower than 1.3, so the part of the code you mentioned does trigger.
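As a quick check of that arithmetic, the following Python sketch applies the standard SM-2 update EF' = EF + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)) with q = 3 until the 1.3 floor is reached. The helper name `update_ef` is hypothetical; the starting value 2.5 and the floor 1.3 are the ones mentioned in this thread.

```python
def update_ef(ef, q):
    """One SM-2 E-Factor update, floored at 1.3."""
    new_ef = ef + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02))
    return max(new_ef, 1.3)


ef = 2.5
reviews = 0
while ef > 1.3:
    ef = update_ef(ef, 3)   # each q = 3 answer lowers the EF by about 0.14
    reviews += 1

print(reviews, ef)
```

Starting from the default, nine consecutive q = 3 answers are enough for the clamp to take effect.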

It would be great if you can share some of the sources you are following, it would help me understand where you are coming from.

As for the implementation of the SM-2 algorithm, I believe the owner of the project is following the official documentation of the algorithm. I don't think SM-2 is the perfect algorithm, and there can definitely be improvements to it, but it does show a card more often when it is answered incorrectly, although without changing the EF value. Perhaps using static values like interval=1 or interval=6 after an incorrect answer loses some flexibility and accuracy, but that is the SM-2 algorithm.

ucctheblend commented 3 years ago

According to https://www.supermemo.com/en/archives1990-2015/english/ol/sm2,

> If the quality response was lower than 3 then start repetitions for the item from the beginning without changing the E-Factor

However, this seems to contradict the Delphi source code here: https://www.supermemo.com/archives1990-2015/english/ol/sm2source

```pascal
procedure Repetition(ElementNo,Grade:longint;var NextInterval:longint;commit:WordBool);
var DataRecord:TDataRecord;
begin
    DataRecord:=GetDataRecord(ElementNo);
    with DataRecord do begin
        if Grade>=3 then begin
            if Repetition=0 then begin
                Interval:=1;
                Repetition:=1;
            end
            else if Repetition=1 then begin
                Interval:=6;
                Repetition:=2;
            end
            else begin
                Interval:=round(Interval*EF);
                Repetition:=Repetition+1;
            end;
        end
        else begin
            Repetition:=0;
            Interval:=1;
        end;
        EF:=EF+(0.1-(5-Grade)*(0.08+(5-Grade)*0.02));
        if EF<1.3 then
            EF:=1.3;
        NextInterval:=Interval;
    end;
    if commit then
        SetDataRecord(ElementNo,DataRecord);
end;
```
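For readers who don't use Delphi, here is a rough Python rendering of the same procedure. It is a sketch, not an official port: `CardState` and `repetition_step` are hypothetical names, the `GetDataRecord`/`SetDataRecord` persistence and the `commit` flag are left out, and rounding of exact .5 ties may not match Delphi's `round` in every case.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    # Hypothetical stand-in for the TDataRecord fields used by the Delphi code.
    interval: int = 0      # days until the next review
    repetition: int = 0    # consecutive successful reviews
    ef: float = 2.5        # E-Factor; 2.5 is the default mentioned in this thread


def repetition_step(state: CardState, grade: int) -> CardState:
    """Apply one review with the given grade (0-5) and return the new state."""
    interval, repetition = state.interval, state.repetition
    if grade >= 3:
        if repetition == 0:
            interval, repetition = 1, 1
        elif repetition == 1:
            interval, repetition = 6, 2
        else:
            # Uses the EF from before this review, as the Delphi code does.
            interval, repetition = round(interval * state.ef), repetition + 1
    else:
        # Failed recall: restart the repetition sequence.
        interval, repetition = 1, 0
    # The EF is updated for every grade, including 0-2, then floored at 1.3.
    ef = state.ef + (0.1 - (5 - grade) * (0.08 + (5 - grade) * 0.02))
    ef = max(ef, 1.3)
    return CardState(interval=interval, repetition=repetition, ef=ef)


card = repetition_step(CardState(interval=15, repetition=3, ef=2.5), grade=2)
print(card)
```

In this rendering a failing grade both resets the schedule and lowers the EF, which is the behaviour the Delphi source shows and the written description appears to contradict.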

Also, if EF is not adjusted when q < 3, then wouldn't it make no difference whether you set q to 0, 1, or 2? If so, what's the point of having these different quality levels, instead of just one failing grade?
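For reference, evaluating the EF update line from the Delphi source for each grade shows how far apart the grades are under that code. The symbol ΔEF is introduced here only for illustration; the 1.3 floor is applied afterwards and is not shown.

```latex
\Delta EF(q) = 0.1 - (5 - q)\,\bigl(0.08 + (5 - q)\cdot 0.02\bigr)

\begin{aligned}
\Delta EF(0) &= 0.1 - 5 \cdot 0.18 = -0.80 \\
\Delta EF(1) &= 0.1 - 4 \cdot 0.16 = -0.54 \\
\Delta EF(2) &= 0.1 - 3 \cdot 0.14 = -0.32 \\
\Delta EF(3) &= 0.1 - 2 \cdot 0.12 = -0.14 \\
\Delta EF(4) &= 0.1 - 1 \cdot 0.10 = 0 \\
\Delta EF(5) &= 0.1 - 0 = +0.10
\end{aligned}
```

So if the EF is updated on every review, grades 0, 1 and 2 penalize the EF by clearly different amounts; if it is not, they are indistinguishable.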

alankan886 commented 3 years ago

Hi @ucctheblend,

Thanks for pointing that out, that's interesting.

In that case, I agree that changing EF when q < 3 makes more sense than the other one.

I was planning on grouping responses 0 to 2 when I build my new API, sort of like how Anki only uses 4 responses.


Edit:

I definitely misunderstood the explanation of the algorithm. The phrase "start repetitions for the item from the beginning without changing the E-Factor" refers to resetting the repetitions while keeping the newly calculated E-Factor, so a new E-Factor is calculated for every quality value.