open-spaced-repetition / fsrs4anki

A modern Anki custom scheduling add-on based on the Free Spaced Repetition Scheduler (FSRS) algorithm
https://github.com/open-spaced-repetition/fsrs4anki/wiki
MIT License
2.75k stars · 137 forks

[Beta Testing] FSRS4Anki 4.0.0 Scheduler, Optimizer and Helper. #348

Closed · L-M-Sherlock closed this issue 1 year ago

L-M-Sherlock commented 1 year ago
Expertium commented 1 year ago

I suggest capping requested R between 0.75 and 0.97 in the scheduler code. This is needed to prevent the user from choosing insanely high or low values.

L-M-Sherlock commented 1 year ago

This is needed to prevent the user from choosing insanely high or low values.

I agree. But the optimizer sometimes suggests R=0.7. Some users with poor memory would not achieve 0.75.

Besides, the scheduler would still store the input R. And the helper add-on should also cap it when getting the parameters from the custom scheduling code.
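The cap being discussed could be as simple as a clamp applied wherever the helper reads the requested retention. A minimal sketch, assuming the 0.75-0.97 bounds from the suggestion above (the function name and call site are illustrative, not the actual helper code):

```python
# Hypothetical sketch of the proposed cap on requested retention.
# The 0.75/0.97 bounds come from the discussion above; the function
# name is illustrative, not the actual add-on code.
def clamp_requested_retention(r, lo=0.75, hi=0.97):
    """Clamp the user-supplied requested retention into [lo, hi]."""
    return min(max(r, lo), hi)

print(clamp_requested_retention(0.99))  # 0.97
print(clamp_requested_retention(0.70))  # 0.75
print(clamp_requested_retention(0.85))  # 0.85
```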

Expertium commented 1 year ago

Sherlock, I recommend running the new optimizer on all collections submitted to you via the Google form and averaging the optimal parameters (weighted by the number of reviews). This would achieve two things:

  1. A better set of initial parameters could improve convergence during optimization.
  2. If the user doesn't have enough data, they will just get average parameters, which should be good enough at first.

I suggest introducing a hard limit of 1000 reviews, excluding same-day reviews. If the user has more reviews than that, run the optimizer. Otherwise, print a warning like "Insufficient data" and print the average parameters.
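The weighted averaging proposed here can be sketched in a few lines of NumPy. This is an illustration, not the optimizer's actual code: toy 3-element vectors stand in for the 17 FSRS parameters, and the cutoff mirrors the suggested 1000-review hard limit.

```python
import numpy as np

# Sketch of the proposed default-parameter computation: average each
# collection's optimized parameter vector, weighted by its review count.
# Toy 3-element vectors stand in for the real 17-parameter FSRS vectors.
param_sets = np.array([
    [1.0, 2.0, 3.0],  # collection A
    [2.0, 4.0, 6.0],  # collection B
])
review_counts = np.array([1000, 3000])
default_w = np.average(param_sets, axis=0, weights=review_counts)
print(default_w)  # close to [1.75, 3.5, 5.25]

# Per-user decision mirroring the suggested hard limit:
user_reviews = 600  # excluding same-day reviews
if user_reviews < 1000:
    print("Insufficient data; using average parameters:", default_w)
```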

Expertium commented 1 year ago

Also, it would be great if you implemented what I suggested here.

Expertium commented 1 year ago

As @user1823 suggested, the helper add-on should be compatible with both v3 and v4. In other words, it should be backward-compatible, so that if someone is still using v3, their add-on won't break. Is this implemented?

L-M-Sherlock commented 1 year ago

As @user1823 suggested, the helper add-on should be compatible with both v3 and v4. In other words, it should be backward-compatible, so that if someone is still using v3, their add-on won't break. Is this implemented?

Of course. You can test it with both v3 and v4.
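One simple way for the helper to stay backward-compatible is to branch on the parameter count. This sketch assumes the weight-vector lengths seen in this thread (17 for v4) and v3's 13 weights; the function is illustrative, not the add-on's actual code:

```python
# Illustrative sketch (not the helper's actual code) of branching on the
# parameter count: FSRS v3 used 13 weights, v4 uses 17 (as in the
# parameter vectors quoted in this thread).
def detect_scheduler_version(w):
    if len(w) == 17:
        return 4
    if len(w) == 13:
        return 3
    raise ValueError(f"Unexpected number of FSRS parameters: {len(w)}")

print(detect_scheduler_version([0.0] * 17))  # 4
print(detect_scheduler_version([0.0] * 13))  # 3
```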

cjdduarte commented 1 year ago

When I insert this new code "fsrs4anki_scheduler.js"

[screenshot]

and when running the helper, this message appears below:

[screenshot]

Expertium commented 1 year ago

[screenshot]

I'm genuinely curious how I'm getting this error, considering that curve_fit should only run if there are more than 100 reviews for that grade.

L-M-Sherlock commented 1 year ago

and when running the helper, this message appears below

Do you install the new helper add-on?

L-M-Sherlock commented 1 year ago

I'm genuinely curious how I'm getting this error, considering that curve_fit should only run if there are more than 100 reviews for that grade.

Could you share the related deck file with me?

Expertium commented 1 year ago

https://drive.google.com/file/d/1WyYRkoZZLllbESz5eCXq7hxFaHAIfrAJ/view?usp=sharing

L-M-Sherlock commented 1 year ago

https://drive.google.com/file/d/1WyYRkoZZLllbESz5eCXq7hxFaHAIfrAJ/view?usp=sharing

image

I find that your cards' cids are weird: they are too small. Normally, the cid is the timestamp of the creation date. In 4.0.0, the optimizer filters out all cards whose cid is earlier than 2006 (the year the first Anki version was released).

Expertium commented 1 year ago

I don't know, it's a pre-made deck, I didn't make these cards. I guess whoever made them somehow messed it up, and the creation date is now the earliest date in Unix time.

L-M-Sherlock commented 1 year ago

I don't know, it's a pre-made deck, I didn't make these cards. I guess whoever made them somehow messed it up, and the creation date is now the earliest date in Unix time.

OK, I will remove the filter in the next patch.

perhydrol commented 1 year ago

I have the same problem, but all my cards were made by myself. I'm sure that all of them were created in 2023. My deck: LinearAlgebra.zip

ValueError                                Traceback (most recent call last)
c:\Users\telly\Documents\project\fsrs4anki\fsrs4anki\fsrs4anki_optimizer.ipynb, cell 11, line 19
      1 """
      2 w[0]: initial_stability_for_again_answer
      3 w[1]: initial_stability_step_per_rating
    (...)
     16 https://github.com/open-spaced-repetition/fsrs4anki/wiki/Free-Spaced-Repetition-Scheduler
     17 """
     18 optimizer.define_model()
---> 19 optimizer.pretrain()
     20 print(optimizer.init_w)

File c:\Users\telly\Documents\project\fsrs4anki\fsrs4anki\fsrs4anki_optimizer.py:518, in Optimizer.pretrain(self)
    515 def S0_rating_curve(rating, a, b, c):
    516     return np.exp(a + b * rating) + c
--> 518 params, covs = curve_fit(S0_rating_curve, list(rating_stability.keys()), list(rating_stability.values()), sigma=1/np.sqrt(list(rating_count.values())), method='dogbox', bounds=((-15, 0.03, 0), (15, 7, 30)))
    519 print('Weighted fit parameters:', params)
    520 predict_stability = S0_rating_curve(np.array(list(rating_stability.keys())), *params)

File c:\Users\telly\Documents\project\fsrs4anki\.conda\lib\site-packages\scipy\optimize\_minpack_py.py:820, in curve_fit(f, xdata, ydata, p0, sigma, absolute_sigma, check_finite, bounds, method, jac, full_output, **kwargs)
    817         xdata = np.asarray(xdata, float)
    819 if ydata.size == 0:
--> 820     raise ValueError("`ydata` must not be empty!")
    822 # Determine type of sigma
    823 if sigma is not None:

ValueError: `ydata` must not be empty!

[image]

https://drive.google.com/file/d/1WyYRkoZZLllbESz5eCXq7hxFaHAIfrAJ/view?usp=sharing

image

I find that your cards' cids are weird: they are too small. Normally, the cid is the timestamp of the creation date. In 4.0.0, the optimizer filters out all cards whose cid is earlier than 2006 (the year the first Anki version was released).

L-M-Sherlock commented 1 year ago

I have the same problem, but all my cards were made by myself. I'm sure that all of them were created in 2023.

Your deck only has 400+ reviews, which is not enough to optimize the model. I will add an informative message about that instead of letting the code raise an error.
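The informative check could look something like this. It is a sketch only: the function name is illustrative, and the 1000-review threshold echoes the hard limit suggested earlier in this thread, not a value confirmed for the optimizer.

```python
# Illustrative sketch of the proposed data-sufficiency check, run before
# pretraining so the user sees a clear message instead of scipy's
# "`ydata` must not be empty!" error. Name and threshold are examples.
def check_enough_reviews(review_count, minimum=1000):
    if review_count < minimum:
        return (f"Insufficient data: only {review_count} reviews found "
                f"(at least {minimum} recommended). Using default parameters.")
    return None

print(check_enough_reviews(400))
```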

Expertium commented 1 year ago

Your deck only has 400+ reviews, which is not enough to optimize the model. I will add an informative message about that instead of letting the code raise an error.

Speaking of which, that's one of the reasons why I suggested this.

L-M-Sherlock commented 1 year ago

Your deck only has 400+ reviews, which is not enough to optimize the model. I will add an informative message about that instead of letting the code raise an error.

Speaking of which, that's one of the reasons why I suggested this.

Yeah. And I have added some useful warnings in #349:

image
Expertium commented 1 year ago

[screenshot]

I think these features should be turned on by default, rather than off. Also, remove "requires Fuzz" from the description.

user1823 commented 1 year ago

I have some feedback to share about the optimizer. (mostly cosmetic changes)

  1. Hide the results of pre-training by default but allow them to become visible if the user desires.
  2. Don't print the init_w after pre-train. The user can wrongly assume that it is the optimized set of parameters.
  3. The S0_rating_curve shouldn't execute at all if the initial stability for all the 4 ratings was calculated accurately (i.e. the data was adequate for all the 4 ratings).
  4. Shift the pre-training to section 2.2. It is more logical to place it there.
  5. Remove the following; they are of no use to the user. [screenshots]
  6. Hide the intermediate values of the parameters during the training process. Again, you can allow them to become visible if the user desires. Do the same thing for the graphs in the same section.
Expertium commented 1 year ago

2. Don't print the init_w after pre-train. The user can wrongly assume that it is the optimized set of parameters.

I got confused by that too; it's definitely unnecessary.

Expertium commented 1 year ago

@L-M-Sherlock for some reason when I review new cards, the "Easy" interval is much higher than the corresponding S0 parameter. All decks have this problem.

[screenshot]

Diplproveably commented 1 year ago

This is needed to prevent the user from choosing insanely high or low values.

I agree. But the optimizer sometimes suggests R=0.7. Some users with poor memory would not achieve 0.75.

Besides, the scheduler would still store the input R. And the helper add-on should also cap it when getting the parameters from the custom scheduling code.

I think it's necessary to use a relatively low R in incremental writing; after all, a fairly high dose of forgetting is needed to allow for more creativity. So there's no need to put a limit on the R entered by the user.

L-M-Sherlock commented 1 year ago

@L-M-Sherlock for some reason when I review new cards, the "Easy" interval is much higher than the corresponding S0 parameter. All decks have this problem.

What's your requested retention? And could you share the full configuration with me?

L-M-Sherlock commented 1 year ago

I have some feedback to share about the optimizer. (mostly cosmetic changes)

Done in #350

user1823 commented 1 year ago

@L-M-Sherlock

Reason for adding this message: let's say a user didn't notice that a new major version has been released, and he used the optimizer for the monthly update of his parameters. If he is not informed of the major change in the scheduler, he would put the parameters into the old scheduler (assuming he didn't notice the increase in the number of parameters). Then he would come asking for help regarding the error shown by the scheduler (if it actually shows an error) or regarding incorrect scheduling of the cards (if no error is shown).

user1823 commented 1 year ago

@L-M-Sherlock, just a small nit: The note added with the last patch doesn't look right. Especially, the "Note" part looks odd (perhaps because of the colour). Also, the text is duplicated. It is there in the code as well as the output.

I recommend using the following in a text block just after the output parameters.

<font color=orange>Note: These values should be used with FSRS scheduler v4.0.0 or above.</font>

Like this:

user1823 commented 1 year ago

@L-M-Sherlock, one advice regarding the release process.

Release an update to the helper add-on (after it has been tested) before confirming the release for the scheduler and optimizer. This will ensure that most users would have the latest version of the helper add-on when they update the scheduler code.

L-M-Sherlock commented 1 year ago

I recommend using the following in a text block just after the output parameters.

OK. I will adopt this suggestion in the next version, but I wouldn't release a patch version just for it.

Release an update to the helper add-on (after it has been tested) before confirming the release for the scheduler and optimizer. This will ensure that most users would have the latest version of the helper add-on when they update the scheduler code.

Of course. I will release the scheduler and optimizer formally after we update the helper add-on and refine the documents.

galantra commented 1 year ago

When I insert this new code "fsrs4anki_scheduler.js"

[screenshot]

and when running the helper, this message appears below:

[screenshot]

I used to get the same error (using Optimizer 4.0.1, Scheduler qt6 v4.0.0 and the Helper from https://github.com/open-spaced-repetition/fsrs4anki/issues/348#issue-1802443582) on 2 out of 4 collections that I tried.

Then I ran the optimizations again with the updated Optimizer v4.0.3, which yielded different parameters. These didn't result in any errors.

edit: I had also updated the Helper from https://github.com/open-spaced-repetition/fsrs4anki/issues/348#issue-1802443582 with the newest change from GitHub. This change encompassed only commit https://github.com/open-spaced-repetition/fsrs4anki-helper/commit/f91d992aa533031d885db80818fdbc06c45e817c, which I think is not related to the error and thus wasn't the change that solved it.

L-M-Sherlock commented 1 year ago

I used to get the same error (using Optimizer 4.0.1 and Scheduler qt6 v4.0.0 and the Helper from the first post in this thread) on 2 out of 4 collections that I tried.

Could you provide the complete scheduler code affected by this error?

Expertium commented 1 year ago

@L-M-Sherlock for some reason when I review new cards, the "Easy" interval is much higher than the corresponding S0 parameter. All decks have this problem.

What's your requested retention? And could you share the full configuration with me?

Oh, I'm an idiot. My requested R is 0.8, of course the intervals are gonna be longer.

galantra commented 1 year ago

I used to get the same error (using Optimizer 4.0.1 and Scheduler qt6 v4.0.0 and the Helper from the first post in this thread) on 2 out of 4 collections that I tried.

Could you provide the completed scheduler code affected by this error?

I've just reverted to the configuration I used at the time and tried to reproduce the error. Thanks to https://github.com/sabrogden/Ditto/, I was even able to search in my clipboard history for the exact JS script I used.

Long story short, it was very likely just a syntax error (as far as my insight goes): the JS script had two newlines after "w" that shouldn't have been there.

Relevant excerpt:

```
// FSRS4Anki v4.0.0 Scheduler Qt6
set_version();
// The latest version will be released on https://github.com/open-spaced-repetition/fsrs4anki

// Configuration Start

const deckParams = [{
  // Default parameters of FSRS4Anki for global
  "deckName": "global config for FSRS4Anki",
  "w":

  [28.65, 29.51, 63.65, 1420.09, 4.5854, 0.1884, 1.1027, 0.1174, 1.686, 0.3616, 1.1866, 2.9257, 0.0174, 0.4557, 1.9994, 0.0001, 3.107],
  // The above parameters can be optimized via FSRS4Anki optimizer.
  // For details about the parameters, please see: https://github.com/open-spaced-repetition/fsrs4anki/wiki/Free-Spaced-Repetition-Scheduler
  // User's custom parameters for global
  "requestRetention": 0.9, // recommended setting: 0.8 ~ 0.9
  "maximumInterval": 36500,
  // FSRS only modifies the long-term scheduling. So (re)learning steps in deck options work as usual.
  // I recommend setting steps shorter than 1 day.
},
{
  // Example 1: User's custom parameters for this deck and its sub-decks.
  // Need to add <div id=deck deck_name="{{Deck}}"></div> to your card's front template's first line.
  "deckName": "ALL::Learning::English::Reading",
  "w": [1.2, 2.4, 3.6, 4.8, 5.1483, 1.4221, 1.2282, 0.035, 1.4668, 0.1286, 0.7539, 1.9671, 0.2307, 0.32, 0.9451, 0.5, 2],
  "requestRetention": 0.9,
  "maximumInterval": 36500,
},
{
  // Example 2: User's custom parameters for this deck and its sub-decks.
  // Don't omit any keys.
  "deckName": "ALL::Archive",
  "w": [1.3, 1.8, 2.3, 2.8, 4.9532, 1.502, 1.0922, 0.0081, 1.3771, 0.0294, 0.6718, 1.8335, 0.4066, 0.7291, 0.5517, 0.5, 2],
  "requestRetention": 0.9,
  "maximumInterval": 36500,
}];
```

When I removed the newlines, it worked.

Does that seem plausible? @cjdduarte, can you check whether you had a syntax error, too?

L-M-Sherlock commented 1 year ago

image

The helper add-on uses a regex to match the values of the weights, which cannot match this case.
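To illustrate the failure mode (this is not the add-on's actual pattern): a regex that expects the array to immediately follow the `"w"` key matches well-formed scheduler code but fails as soon as stray newlines separate the two.

```python
import re

# Illustrative only -- not the helper add-on's actual pattern. A regex that
# expects the array right after the key matches well-formed config code
# but fails once stray newlines separate the array from the "w" key.
pattern = re.compile(r'"w": \[([^\]]*)\]')

good = '"w": [1.2, 2.4, 3.6]'
bad = '"w":\n\n[1.2, 2.4, 3.6]'

print(bool(pattern.search(good)))  # True
print(bool(pattern.search(bad)))   # False
```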

ghost commented 1 year ago

I've got a minor suggestion. For me, uploading an exported deck to Colab often fails without any error description. What I found after some searching is that uploading a file using code in the notebook is, for some reason, much more reliable. I suggest adding the following message and code block after the line "You can export it via File -> Export... or Ctrl + E in the main window of Anki.":

If you have problems uploading your file in Files panel, uncomment the following line and execute it:

# from google.colab import files; _ = files.upload()
user1823 commented 1 year ago

I have got a suggestion too.

Make the optimizer save the analysis for all the first ratings in the Files section of Google Colab (for e.g. how revlog.csv is saved).

It will help in debugging when a user complains that the initial stability doesn't look right. This is especially important because our approach for estimating the initial stability can produce weird values for some types of data.

Adding the following line of code at the end of the code that does curve fitting for each grade worked for me.

group.to_csv(f'Data for {first_rating}.csv', index=False)

However, it would be better to combine the files into a single CSV file instead of saving 4 different files.
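Combining the four per-rating files could be done by tagging each group before concatenating. A sketch with toy data, assuming pandas; the column names, the `first_rating` tag, and the filename are illustrative, not the optimizer's actual ones:

```python
import pandas as pd

# Sketch of the single-file alternative: tag each per-rating group with its
# first rating, then concatenate and save once instead of writing four CSVs.
# Toy data; column names and the filename are illustrative.
groups = {
    1: pd.DataFrame({"delta_t": [1, 2], "recall": [0.85, 0.70]}),
    3: pd.DataFrame({"delta_t": [3, 7], "recall": [0.95, 0.90]}),
}
combined = pd.concat(
    [g.assign(first_rating=rating) for rating, g in groups.items()],
    ignore_index=True,
)
combined.to_csv("pretrain_data.csv", index=False)
print(combined.shape)  # (4, 3)
```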

Expertium commented 1 year ago

[screenshot]

I don't think these stats are necessary to display.

[screenshot]

I think it would be better to have one big progress bar instead of 5 bars (one for each split).

[screenshot]

These graphs should be removed as well. You can keep them in "dev" versions (where all of the code is visible and editable), but they're just confusing for most users. I suggest only displaying the calibration graph, the one without D and S.

Expertium commented 1 year ago

[screenshot]

Also, as I've mentioned, all 3 of these options should be on by default, rather than off. And remove "requires Fuzz", since many users don't even know about that line of code, so it's confusing.

ghost commented 1 year ago

I think that the estimation of the initial stability for Easy when there is not enough data is way too optimistic.

first rating: 1
rating history: 1,3,3,3,3,3,3,3,3,3,3
interval history: 0d,1d,3d,11d,1.2m,3.7m,10.5m,2.3y,5.9y,14.2y,32.5y
difficulty history: 0,5.7,5.5,5.4,5.2,5.1,5.1,5.0,5.0,4.9,4.9

first rating: 2
rating history: 2,3,3,3,3,3,3,3,3,3,3
interval history: 0d,1d,4d,14d,1.5m,4.7m,1.1y,2.9y,7.4y,17.7y,40.3y
difficulty history: 0,5.3,5.2,5.1,5.0,5.0,4.9,4.9,4.9,4.9,4.9

first rating: 3
rating history: 3,3,3,3,3,3,3,3,3,3,3
interval history: 0d,7d,25d,2.7m,8.1m,1.8y,4.8y,11.9y,27.8y,61.9y,100.0y
difficulty history: 0,4.8,4.8,4.8,4.8,4.8,4.8,4.8,4.8,4.8,4.8

first rating: 4
rating history: 4,3,3,3,3,3,3,3,3,3,3
interval history: 0d,10.2m,2.4y,6.3y,15.6y,36.4y,80.8y,100.0y,100.0y,100.0y,100.0y
difficulty history: 0,4.4,4.5,4.6,4.6,4.7,4.7,4.7,4.8,4.8,4.8
w = [0.7, 0.82, 6.85, 307.23, 4.8285, 0.4502, 0.503, 0.2325, 1.582, 0.1, 0.9801, 2.1102, 0.0901, 0.304, 1.0834, 0.4167, 2.2577]

There is no chance that I will remember something ten months after a single review, even if I answered "Easy". I think the maximum value of w[3] that I'd personally be comfortable with would be around 20. The problem with this kind of overestimation is that the user may not notice that something is wrong, and it will be revealed only after a very long period of time.

Expertium commented 1 year ago

In the other issue I suggested a method that can decrease S0 for "Easy" and make it more realistic, and I also suggested making it so that it cannot exceed 365. We might limit it even more, to something like 30-40, but the problem is that it could hurt material that is genuinely that easy. Although, on the other hand, overestimation could be worse than underestimation. @L-M-Sherlock @user1823 what do you think is a reasonable max value for S0? 30, 40, 100?

L-M-Sherlock commented 1 year ago

Why not just limit the first interval in the scheduler?

Expertium commented 1 year ago

You still have to choose a reasonable max value either way.

ghost commented 1 year ago

I don't think this will hurt very easy cards a lot, because in that case the user will review them and press Easy a couple of times. This won't take much of their time, and the intervals will grow very quickly.

Here I set the first rating to Easy to simulate a lower initial stability.

rating history: 3,4,4,3,3,3,3,3,3
interval history: 0,7,50,312,901,2404,5986,14039,31233,36500
difficulty history: 0,4.8,4.4,4.1,4.3,4.4,4.5,4.6,4.6,4.7
L-M-Sherlock commented 1 year ago

I think setting the limit to 40 days is OK, because Woz used 40 days as the maximum interval for https://supermemo.guru/wiki/Cloze_interval

Expertium commented 1 year ago

Ok then, let's add additive smoothing (it makes S0 for "Easy" a bit more realistic) to the final version and update the bounds for S0.

    params, covs = curve_fit(power_curve, delta_t, recall, sigma=1/np.sqrt(count), bounds=((0.1), (40)))
    rating_stability[rating] = max(min(S0_rating_curve(rating, *params), 40), 0.1)

EDIT: I like user1823's idea. If a user has >1000 reviews for some grade, then set the limit to 100 instead of 40. The rationale is that if the user has a lot of reviews, then we can lift the limit a little bit because the estimate of S will be very accurate.

if sum(count) > 1000:
    params, covs = curve_fit(power_curve, delta_t, recall, sigma=1/np.sqrt(count), bounds=((0.1), (100)))
else:
    params, covs = curve_fit(power_curve, delta_t, recall, sigma=1/np.sqrt(count), bounds=((0.1), (40)))
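For reference, the additive smoothing mentioned above can be sketched as pulling each recall estimate toward a prior using a pseudo-count, so a handful of lucky reviews can't push S0 for "Easy" to an extreme value. The prior of 0.9 and pseudo-count of 1 are illustrative choices, not the optimizer's actual values.

```python
# Illustrative sketch of additive (Laplace-style) smoothing on per-interval
# recall estimates before curve fitting. The prior of 0.9 and pseudo-count
# of 1 are example values, not the optimizer's actual ones.
def smoothed_recall(successes, total, prior=0.9, pseudo_count=1):
    return (successes + prior * pseudo_count) / (total + pseudo_count)

# A single successful review no longer implies 100% recall:
print(smoothed_recall(1, 1))     # 0.95
# With plenty of data the smoothing barely moves the estimate:
print(smoothed_recall(90, 100))  # 0.9
```
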
user1823 commented 1 year ago

Why not just limit the first interval in the scheduler?

Limiting the interval without limiting the stability doesn't make sense. Reasons:

  1. The next interval would still be very large (based on the next stability).
  2. It disturbs the relationship between the stability, the requestRetention and the interval.

Also, as Expertium said, we would still need to think of an appropriate limit.

In my opinion, 60 days should be a reasonable upper limit for S0. Even if 60 days is an underestimate (and the actual value should have been about 200 days), it would still not increase the workload much.

A way to decrease the impact of this "forced" limit could be to remove the upper limit if the number of reviews with that first rating > 1000.

Expertium commented 1 year ago

  1. The next interval would still be very large (based on the next stability).

That's a good catch, I haven't thought about that.

szalejot commented 1 year ago

@L-M-Sherlock Minor bug in the command-line version of the optimizer: it wasn't updated alongside v4.0. In the package/fsrs4anki_optimizer/__main__.py file, lines 85-86 are:

    optimizer.define_model()
    optimizer.train()

That kinda works, but the first 4 parameters are always [1.0, 2.0, 3.0, 4.0]. I suppose it should rather be:

    optimizer.define_model()
    optimizer.pretrain(verbose=False)
    optimizer.train(verbose=False)

to produce correct results.

L-M-Sherlock commented 1 year ago

I don't think that these stats are necessary to display.

These graphs should be removed as well. You can keep them in "dev" versions (versions where all of the code is visible and you can edit any of it), but they're just confusing for most users, I suggest only displaying the calibration graph, the one without D and S.

They are useful debug info. I tend to keep them.

L-M-Sherlock commented 1 year ago

I recommend you to use the following in a text block just after the output parameters.

However, it would be better to combine the files into a single CSV file instead of saving 4 different files.

EDIT: I like user1823's idea. If a user has >1000 reviews for some grade, then set the limit to 100 instead of 40. The rationale is that if the user has a lot of reviews, then we can lift the limit a little bit because the estimate of S will be very accurate.

In my opinion, 60 days should be a reasonable upper limit for S0. Even if 60 days is an underestimate (and the actual value should have been about 200 days), it would still not increase the workload much.

Minor bug in the command line version of optimizer. It wasn't updated alongside with v4.0

Done in #357