I generated a new one with FSRS-5:
Huh. That's strange, here the workload above 95% is much lower.
Should we replace the graph in the manual? Actually, I think we should smooth the lines a bit. Less accuracy, but it looks better when it's a beautiful curved line.
@L-M-Sherlock sorry for bothering you, but would you please generate it with more precision? Instead of incrementing desired retention by 1%, increment it by 0.2%. And increase the sample size for the simulation.
OK. Here you are:
I wanted to give you some more suggestions, but I think it's better if you just give me the raw data and I plot it myself
I didn't save the raw data. You can plot this image via this command in fsrs-optimizer/:
python ./src/fsrs_optimizer/fsrs_simulator.py
Please save it and send it to me
[0.7, 0.701, 0.702, 0.703, 0.704, 0.705, 0.706, 0.707, 0.708, 0.709, 0.71, 0.711, 0.712, 0.713, 0.714, 0.715, 0.716, 0.717, 0.718, 0.719, 0.72, 0.721, 0.722, 0.723, 0.724, 0.725, 0.726, 0.727, 0.728, 0.729, 0.73, 0.731, 0.732, 0.733, 0.734, 0.735, 0.736, 0.737, 0.738, 0.739, 0.74, 0.741, 0.742, 0.743, 0.744, 0.745, 0.746, 0.747, 0.748, 0.749, 0.75, 0.751, 0.752, 0.753, 0.754, 0.755, 0.756, 0.757, 0.758, 0.759, 0.76, 0.761, 0.762, 0.763, 0.764, 0.765, 0.766, 0.767, 0.768, 0.769, 0.77, 0.771, 0.772, 0.773, 0.774, 0.775, 0.776, 0.777, 0.778, 0.779, 0.78, 0.781, 0.782, 0.783, 0.784, 0.785, 0.786, 0.787, 0.788, 0.789, 0.79, 0.791, 0.792, 0.793, 0.794, 0.795, 0.796, 0.797, 0.798, 0.799, 0.8, 0.801, 0.802, 0.803, 0.804, 0.805, 0.806, 0.807, 0.808, 0.809, 0.81, 0.811, 0.812, 0.813, 0.814, 0.815, 0.816, 0.817, 0.818, 0.819, 0.82, 0.821, 0.822, 0.823, 0.824, 0.825, 0.826, 0.827, 0.828, 0.829, 0.83, 0.831, 0.832, 0.833, 0.834, 0.835, 0.836, 0.837, 0.838, 0.839, 0.84, 0.841, 0.842, 0.843, 0.844, 0.845, 0.846, 0.847, 0.848, 0.849, 0.85, 0.851, 0.852, 0.853, 0.854, 0.855, 0.856, 0.857, 0.858, 0.859, 0.86, 0.861, 0.862, 0.863, 0.864, 0.865, 0.866, 0.867, 0.868, 0.869, 0.87, 0.871, 0.872, 0.873, 0.874, 0.875, 0.876, 0.877, 0.878, 0.879, 0.88, 0.881, 0.882, 0.883, 0.884, 0.885, 0.886, 0.887, 0.888, 0.889, 0.89, 0.891, 0.892, 0.893, 0.894, 0.895, 0.896, 0.897, 0.898, 0.899, 0.9, 0.901, 0.902, 0.903, 0.904, 0.905, 0.906, 0.907, 0.908, 0.909, 0.91, 0.911, 0.912, 0.913, 0.914, 0.915, 0.916, 0.917, 0.918, 0.919, 0.92, 0.921, 0.922, 0.923, 0.924, 0.925, 0.926, 0.927, 0.928, 0.929, 0.93, 0.931, 0.932, 0.933, 0.934, 0.935, 0.936, 0.937, 0.938, 0.939, 0.94, 0.941, 0.942, 0.943, 0.944, 0.945, 0.946, 0.947, 0.948, 0.949, 0.95, 0.951, 0.952, 0.953, 0.954, 0.955, 0.956, 0.957, 0.958, 0.959, 0.96, 0.961, 0.962, 0.963, 0.964, 0.965, 0.966, 0.967, 0.968, 0.969, 0.97, 0.971, 0.972, 0.973, 0.974, 0.975, 0.976, 0.977, 0.978, 0.979, 0.98, 0.981, 0.982, 0.983, 0.984, 0.985, 0.986, 0.987, 0.988, 0.989, 0.99, 0.991, 0.992, 0.993, 0.994, 0.995, 0.996, 0.997, 0.998, 0.999]
[97.5934280182195, 97.22824825346852, 97.47904545431524, 97.38849940771273, 96.89665108713663, 96.97587965861355, 96.8954734913512, 97.17488099787234, 97.09549804437478, 96.92419265085283, 97.0723594461214, 97.33450712188522, 97.02641115077162, 97.04051118660472, 96.84477698494457, 97.00333530555795, 96.59024878971599, 96.46158541334, 96.77870873308817, 96.33635197571016, 96.28073051490945, 96.45379767189873, 95.56571126641754, 95.99446280980632, 95.44502382732261, 95.48043149727145, 95.52854123684439, 95.40444161737847, 95.6274230781238, 95.07976763049297, 95.1095658580644, 95.85664168814118, 95.01708471063768, 95.00870522540848, 95.09258811239553, 94.88334538953333, 94.79684600602019, 94.87871377393992, 94.5463732630096, 94.74179944361317, 94.34859437131432, 94.63016003785597, 94.55026529204838, 94.59903473115617, 94.6210886214512, 94.00962965480767, 94.13018057437144, 93.77080243363292, 93.7163017331956, 93.84853786351223, 93.83777726029169, 94.32492751589335, 94.31680942115452, 94.15257189993481, 93.77582020788026, 93.54283251546505, 93.3613757784776, 93.49282297965098, 93.36921640528236, 93.32189454916457, 93.26798468886126, 93.188367906505, 92.86059593675029, 92.93728964312615, 92.9528009709936, 92.64054381718285, 92.8362598451921, 92.71635778589169, 93.27281456112111, 92.50635160655902, 92.64291708655259, 92.09539175990135, 92.45258704638809, 92.12600240804659, 92.59834674139785, 92.18587434383215, 92.42340322992365, 92.43914584156758, 92.40961290295726, 92.24950632027841, 92.34689118368196, 91.89322185577767, 91.7365509486127, 91.78315579598251, 90.50086484798047, 90.29225407015403, 90.7607283190398, 90.61409114770017, 90.42900383644874, 90.2207822579684, 90.10858289524069, 89.83325221959075, 90.15277094296795, 90.2722375895759, 90.15603682657058, 90.08213006044983, 89.82991119028608, 89.91511934474055, 89.99907829375917, 89.66540793477301, 89.58474115163654, 89.74173472101289, 89.7337336991589, 89.73096402081934, 89.29526887240687, 89.40918645707917, 89.5469637628481, 89.43824006510332, 89.63033508434893, 89.28756134776715, 89.37240376896095, 89.49971675764945, 89.70972691619845, 89.48624997939766, 89.42301177744007, 88.98806639067132, 88.707568558238, 88.86554655089608, 88.95372974404759, 88.64816747575776, 88.74311608291978, 88.68014427089543, 88.92748397863923, 88.8154226740526, 88.57998990054769, 88.98985157117619, 88.84750647421822, 89.19278684444261, 89.23801810570325, 89.20268075055274, 88.8600208509342, 88.6662182591752, 88.75881546118052, 88.9495538494976, 88.78371660672042, 89.08442706992818, 88.50736893721887, 88.66514979648522, 88.41249328859948, 88.49979130773977, 88.6370930118383, 88.36957660846551, 88.10368896217591, 88.45277232779695, 88.16358147276628, 88.6406659446418, 88.69085889191464, 88.36005794250866, 88.43354547051182, 88.25110969518614, 88.47739561338372, 87.88075464690971, 88.31288952624402, 88.4940479599122, 88.16780918811057, 88.58736248385237, 88.60940981936928, 88.44202971149969, 88.51578404796757, 88.37808301861833, 88.2143412352401, 88.07286900699586, 88.48913625473534, 88.33042591533766, 88.17443933669148, 88.83644862318157, 88.34245714711554, 88.40080494113107, 88.35639263794064, 88.09356256431454, 88.21074137110152, 88.27142683410567, 88.06500754531415, 88.45312335149825, 88.53912857593826, 88.46873028364003, 88.44623439053271, 88.5122240389399, 88.41054408398071, 88.68023732967228, 88.528415520447, 89.06992092073114, 88.58827161445441, 88.4173360938752, 88.99075300234243, 88.76457196794904, 88.87573689315089, 88.8244157380859, 88.98322874268896, 
89.1921322643156, 89.08287623632877, 89.08769734745262, 89.18487503668048, 89.13747837779329, 89.39918097585904, 89.25281181911149, 89.63463750616785, 89.25209634972086, 89.68876408705577, 89.39326850957194, 89.50649376643932, 89.85860470033487, 89.99428659217786, 89.9999490280992, 89.84354974542214, 90.35537615772066, 90.23342504016772, 90.46990705030105, 90.66923587363884, 90.51141002023806, 90.67231077465712, 91.05504060392587, 91.26856878128729, 91.04673638343365, 91.14891615551986, 91.56781359332442, 91.36434509271137, 91.92556659430839, 92.25805811720205, 91.9848045529049, 92.33333882079523, 92.5165874143888, 92.87603497434782, 93.07599691280834, 92.8423060845239, 93.5061310702842, 93.5129070965123, 93.51938492794136, 94.17305748632043, 94.58626009214171, 94.59581020155792, 94.6885791924855, 94.77732175638613, 95.75139288773207, 95.8384563609055, 96.1575868028406, 96.68298290495807, 97.21959832270169, 97.44847982112671, 97.83864026015628, 98.53244848026135, 98.6655652071842, 99.3774616972807, 99.65512511778843, 99.8512396352247, 100.71178475586936, 100.72225608498191, 101.28710243350547, 101.98467740782993, 102.58899588718873, 102.69630468983948, 103.59938589051414, 104.25775142972208, 104.89464251400084, 106.07472232854309, 106.4519935248136, 107.23612860818945, 107.52001510757825, 109.72320817328794, 110.67377207560642, 111.29480978623184, 112.77448950347573, 113.39335728726923, 114.60335207201126, 116.34872987782661, 117.92263672140099, 119.48985818050531, 120.29063511377585, 122.18755414096523, 123.32513447114098, 125.4504692981476, 127.02867002075851, 128.64752260390233, 130.74548998315402, 132.69467248793634, 134.81973955921836, 137.8998852443991, 140.37970176553554, 143.79600888057018, 147.02870698854298, 150.72681059337805, 154.4413908666193, 158.39263972802354, 163.46325679269574, 169.8227711725536, 174.5582720724376, 181.89113067601488, 188.24280917078806, 198.66379626485346, 207.92989381366408, 220.6053764585805, 234.5349361053963, 252.3737059029466, 273.2480610265181, 301.44470992194675, 339.80409546448675, 392.1117452474332, 470.4931182969767, 614.5668727227176, 958.8945495647536]
This was much harder than I thought. I had to do a whole bunch of math + crude hacks. I am NOT sharing the code for this abomination.
Anyway, it seems that it is, indeed, quite different from the last one. The workload doesn't increase as steeply as we previously thought, and the optimal value is around 0.85 rather than 0.75. This suggests that previous estimates of optimal retention were underestimates. This also suggests that we should increase the upper limit of CMRR's output range from 0.95 to 0.97 or 0.98.
Anyway, it seems that it is, indeed, quite different from the last one.
Do you have any intuitive explanation of why it might be so different? (If we can't think of any logical reason, I would tend to believe that this shift results from a bug or a bias.)
The workload doesn't increase as steeply as we previously thought
It's caused by the short-term memory module. FSRS-5 allows the stability to increase when the retrievability is 100%.
And also the fact that the simulation accounts for the time it takes to do a same-day review.
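(For intuition, a rough sketch of the short-term effect. The exact exponential form and the parameter indices w17/w18 below are my recollection of the FSRS-5 same-day update, stated here as an assumption rather than quoted from this thread; the parameter values are purely illustrative.)

import math

def short_term_stability(stability: float, grade: int, w17: float, w18: float) -> float:
    """Assumed FSRS-5 same-day update: stability can grow even though
    retrievability is ~100% at the time of a same-day review."""
    return stability * math.exp(w17 * (grade - 3 + w18))

# With positive w17 and w18, a same-day "Good" (grade 3) review still increases
# stability slightly, so later intervals grow faster and the long-run workload
# at high desired retention comes out lower than older simulations predicted.
print(short_term_stability(1.0, 3, w17=0.3, w18=0.6))  # > 1.0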
Ok, then it looks good to me.
But it would be better if the red region were also shown on the right side of the graph. Based on a quick glance at the data, it seems that the red region would start at DR = 0.993.
I don't think there is a point in doing that, since in Anki you can't set DR above 0.99, not even a little bit.
Can we change the word "retention" in the graph to "desired retention"? Then it can be used in the manual under the topic titled "Desired Retention". Expertium, please?
LMSherlock already merged it into the guide. Man... sigh, fine.
I don't think there is a point in doing that, since in Anki you can't set DR above 0.99, not even a little bit.
Fair point.
By the way, I am still slightly skeptical about the new graph. According to this graph, 0.97 is in the green zone. But if I use the FSRS previewer with the default FSRS-5 parameters and DR = 0.97, I get this result:
rating history: 3,3,3,3,3,3,3,3,3,3,3
interval history: 0d,1d,1d,2d,3d,4d,6d,8d,11d,15d,20d
Aren't the intervals too short? Should we advise users to set such a high value of DR?
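(A rough sanity check of why the intervals shrink so fast at high DR: with the FSRS-4.5/5 power forgetting curve R(t, S) = (1 + FACTOR·t/S)^DECAY, the next interval is the time at which R drops to the desired retention. DECAY = −0.5 and FACTOR = 19/81 are my understanding of the scheduler's constants, stated here as an assumption.)

DECAY = -0.5          # assumed forgetting-curve exponent
FACTOR = 19.0 / 81.0  # chosen so that the interval equals S when DR = 0.9

def next_interval(stability: float, desired_retention: float) -> float:
    """Interval (days) at which predicted retrievability falls to DR."""
    return stability / FACTOR * (desired_retention ** (1.0 / DECAY) - 1.0)

for dr in (0.85, 0.90, 0.97):
    print(dr, round(next_interval(10.0, dr), 1))
# roughly: 0.85 -> 16.4 days, 0.90 -> 10.0 days, 0.97 -> 2.7 days
# i.e. DR = 0.97 cuts the interval to about a quarter of the stability,
# which is why the interval history above looks so short.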
Can we change the word "retention" in the graph to "desired retention"?
Good suggestion.
Colors are kinda arbitrary. On this graph, green is just "between min. workload and 2*min. workload".
Do you mean to say that the definition of the green, yellow and red zones should be updated?
I mean to say that those definitions are arbitrary. I'm not sure if they need to be updated, though.
Can we do a gradient transition instead of the solid blocks of colour?
I think that the yellow zone should start where the slope of the graph increases drastically. I am not sure of the best way to find that point. But, on doing a crude analysis with Excel, I think that DR = 0.96 should be the transition point.
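(One crude way to locate such a point programmatically, as a sketch reusing the dr/cost arrays from the plotting snippet earlier; the 3× threshold below is arbitrary and only illustrates the idea.)

import numpy as np

# dr, cost: the desired-retention grid and simulated cost arrays from above
slope = np.gradient(cost, dr)                        # numerical d(cost)/d(DR)
baseline = np.median(np.abs(slope))                  # typical slope in the flat region
knee_idx = np.argmax(np.abs(slope) > 3 * baseline)   # first "drastic" increase
print(f"Slope takes off around DR = {dr[knee_idx]:.3f}")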
Can we do a gradient transition instead of the solid blocks of colour?
Oh god, that would be a nightmare.
I think that DR = 0.96 should be the transition point.
Let's just say it starts at min. workload x1.5.
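(With that kind of definition the boundaries fall straight out of the data. A sketch, again reusing dr and cost from the earlier snippet, that finds where the curve first exceeds 1.5× and 2× its minimum to the right of the minimum:)

import numpy as np

best = np.argmin(cost)                   # index of minimum workload
for factor in (1.5, 2.0):
    above = np.nonzero(cost[best:] > factor * cost[best])[0]
    boundary = dr[best + above[0]] if above.size else None
    print(f"{factor}x min workload first exceeded at DR = {boundary}")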
You know, I think most people probably don't get these colours. I am aware that I am a sample of one, but I really had no idea what they were signifying, particularly the yellow colour that is shoehorned between the other colours. It's a bit hard to make head or tail of that.
Red is "whatever you are doing, please stop"
Yellow is "ok, but you can get a better workload/retention ratio if you change desired retention a little"
Green is "nice"
Green and yellow don't seem too different. If you don't like a gradient, you can use different shades of the same colour to signify that.
Meh, I think the latest iteration looks fine.
You're smart, I'm not /s
Edit: can you at least try? They might just look better. Three completely different colours are confusing. I had to look at this really closely when updating the manual to actually understand it.
I'd rather we discuss whether changing the output range of CMRR is a good idea instead of discussing the color of the bike shed workload-retention graph.
But maybe we should add this, too?
@user1823
Why is this graph different? It starts slanting from a different point.
Read what the Y axis says, and what the text above the black horizontal lines says 😉
😮 Cool. The CMRR value seems much higher than expected.
It's cool, indeed. But I'm somewhat worried that users will be confused by two different graphs.
That? The majority of people will be confused by workload:retention being on the x-axis. Having two variables combined like this complicates things (I remember I used to find such graphs really hard to read).
@L-M-Sherlock the graphs here also look like they need to be updated: https://github.com/open-spaced-repetition/fsrs4anki/wiki/The-Optimal-Retention
But maybe we should add this, too?
It's unnecessary to divide the workload by retention because the code has taken it into account.
Wait, I'm confused. So the original graph in this comment doesn't show workload, but rather workload divided by knowledge? Then it's not the same at all! For the guide, we need a graph with just the workload. So we almost used the wrong graph.
Because I use the same function used by CMRR:
def workload_graph(default_params):
    R = [x / 100 for x in range(70, 100)]
    cost_per_memorization = [sample(r=r, **default_params) for r in R]
    ...
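    # Note (illustration, not part of the actual optimizer code): the data posted
    # above was generated on a finer 0.1% grid, which would correspond to
    # replacing the R line above with something like:
    # R = [x / 1000 for x in range(700, 1000)]   # 0.700, 0.701, ..., 0.999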
def sample(
    r,
    w,
    deck_size=10000,
    learn_span=365,
    max_cost_perday=1800,
    learn_limit_perday=math.inf,
    review_limit_perday=math.inf,
    max_ivl=36500,
    learn_costs=DEFAULT_LEARN_COSTS,
    review_costs=DEFAULT_REVIEW_COSTS,
    first_rating_prob=DEFAULT_FIRST_RATING_PROB,
    review_rating_prob=DEFAULT_REVIEW_RATING_PROB,
    first_rating_offset=DEFAULT_FIRST_RATING_OFFSETS,
    first_session_len=DEFAULT_FIRST_SESSION_LENS,
    forget_rating_offset=DEFAULT_FORGET_RATING_OFFSET,
    forget_session_len=DEFAULT_FORGET_SESSION_LEN,
    loss_aversion=2.5,
):
    memorization = []
    # number of simulation runs (more seeds for shorter learning spans)
    if learn_span < 100:
        SAMPLE_SIZE = 16
    elif learn_span < 365:
        SAMPLE_SIZE = 8
    else:
        SAMPLE_SIZE = 4
    for i in range(SAMPLE_SIZE):
        _, _, _, memorized_cnt_per_day, cost_per_day = simulate(
            w,
            r,
            deck_size,
            learn_span,
            max_cost_perday,
            learn_limit_perday,
            review_limit_perday,
            max_ivl,
            learn_costs,
            review_costs,
            first_rating_prob,
            review_rating_prob,
            first_rating_offset,
            first_session_len,
            forget_rating_offset,
            forget_session_len,
            loss_aversion,
            seed=42 + i,
        )
        # total review time divided by the number of cards memorized at the end
        memorization.append(cost_per_day.sum() / memorized_cnt_per_day[-1])
    return np.mean(memorization)
Is cost_per_day just time, or time/knowledge? That's important.
cost_per_day is just time.
So this graph is just workload, right? I really wanna make sure we're not making the wrong graph. The graph in the guide should have workload, NOT workload divided by retention. I mean, we could add both, but the original one was workload, not workload/knowledge (or retention). So @L-M-Sherlock, are you really sure that the data you gave me here is workload, and not workload/knowledge (or workload/retention)?
So this graph is just workload, right?
Nope. All of the previous graphs generated by the optimizer are cost/memorization. If you need the workload graph, I will refactor the function.
We will need it for https://github.com/ankitects/anki-manual/pull/263
On a different note, can we explain the significance of the current graph to people in simple words?
If you need the workload graph, I will refactor the function.
Yes, we need a workload graph for the tutorial.
Also, if the current graph plots cost/memorization, its labels (the y-axis and the dashed horizontal lines) should be corrected.
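(To make the difference concrete: a hypothetical sketch of what such a refactor could look like. `sample_workload` does not exist in the optimizer; `simulate` and its arguments are the ones shown in the `sample()` snippet above, and the import path is my assumption.)

import numpy as np
# assuming simulate can be imported from the optimizer's simulator module
from fsrs_optimizer.fsrs_simulator import simulate

def sample_workload(w, r, deck_size=10000, learn_span=365, *simulate_args):
    """Hypothetical workload-only counterpart of sample(): average review time
    per day, NOT divided by the number of memorized cards."""
    workloads = []
    for i in range(4):
        # remaining positional arguments are the same as in sample() above
        _, _, _, _, cost_per_day = simulate(
            w, r, deck_size, learn_span, *simulate_args, seed=42 + i
        )
        workloads.append(cost_per_day.sum() / learn_span)  # seconds per day
    return np.mean(workloads)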
@L-M-Sherlock Yep, we need just workload
Add this right before plt.show():
plt.tight_layout(h_pad=0, w_pad=0)
This removes all the unnecessary white space around the graph. Or just send me the data; I'll also make it look smoother (and make sure R goes in increments of less than 1%, like 0.1% or 0.2%).
This graph was generated before FSRS-5, when same-day reviews weren't accounted for. I believe that it needs to be re-made, since same-day reviews likely increase the workload and therefore affect the graph.