You have no choice but to omit cards lacking models from the sorted search, right? So I schedule those in the order that they’re listed in my database (flat file, time stamp added, etc.)—Ebisu doesn’t get involved with that aspect.
Right.
So if the user reviews four cards, and gets half of them right, should all four be moved to the "bottom" of the stack, below the cards that have no model (so we can't predict their recall)? How about if the user leaves the app, and comes back in two weeks?
Or am I overthinking this?
Maybe :) I don't keep a stack—each time the user asks for a quiz, I call `predictRecall` on all the cards with models and find the card with the lowest recall probability. This becomes important as you get a mix of old and recently-learned cards—the newly-learned cards' recall probabilities decay much faster than those of old ones, so cards pass each other on that notional stack.
I have various ideas for ameliorating the computational waste of scanning over all learned cards each time you want to quiz. The most straightforward one: when you update a card's model, also calculate and store the timestamps at which its recall probability will be 50%, 5%, 1%, 0.1%, and 0.01%, then do some SQL or binary-search magic to find the set of cards with approximately the lowest recall probabilities given the actual timestamp, and pick one of those to quiz.
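For concreteness, here's a minimal sketch of that precomputation, assuming the Python ebisu package (where `predictRecall(model, elapsed, exact=True)` returns a plain probability) and hours as the time unit; `threshold_timestamps` and `THRESHOLDS` are hypothetical names, not part of Ebisu:

```python
import ebisu

THRESHOLDS = [0.5, 0.05, 0.01, 0.001, 0.0001]

def threshold_timestamps(model, learned_at_hours, horizon_hours=1e6):
    """For each recall threshold, bisect for the elapsed time at which
    predictRecall crosses it (recall decays monotonically with elapsed
    time), and return absolute timestamps. Stored per card, these let a
    plain SQL range query shortlist quiz candidates without evaluating
    every model on every quiz request."""
    out = {}
    for p in THRESHOLDS:
        lo, hi = 0.0, horizon_hours  # horizon must exceed the crossing time
        for _ in range(60):  # 60 bisection rounds is ample for doubles
            mid = 0.5 * (lo + hi)
            if ebisu.predictRecall(model, mid, exact=True) > p:
                lo = mid  # still above the threshold: crossing is later
            else:
                hi = mid
        out[p] = learned_at_hours + 0.5 * (lo + hi)
    return out
```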
(Ensuring you quiz on the card with the absolute lowest recall probability is not that critical. You want to avoid patterns in quizzes—like, if you learn A, then B, then C, and each time you're quizzed on A you know you'll be asked B and C next, that's bad. So bucketing quizzes into approximately the same recall probability is nice from a user perspective and a computational one.)
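And a sketch of the bucketed pick itself; `cards` here is a hypothetical list of `(card_id, model, learned_at_hours)` tuples your app maintains:

```python
import random

import ebisu

def pick_quiz_card(cards, now_hours, slack=0.05):
    """Pick randomly among cards whose predicted recall is within `slack`
    of the minimum, so the quiz order never becomes a memorizable pattern."""
    scored = [(ebisu.predictRecall(m, now_hours - t, exact=True), cid)
              for cid, m, t in cards]
    lowest = min(p for p, _ in scored)
    bucket = [cid for p, cid in scored if p <= lowest + slack]
    return random.choice(bucket)
```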
Does this all make sense? This is how Curtiz works.
In your example, if you learn/review four cards, potentially failing some reviews (you never 'fail' an initial learn, right—you just set the model of a newly learned card to `defaultModel`), and have other cards you haven't learned, then come back later: you run `predictRecall` on all four cards to find one with low (maybe the lowest) recall and quiz that, then rinse and repeat until you get tired of quizzing the same four cards and learn more cards.
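A sketch of that loop's two entry points, with a dict standing in for real storage; this assumes the boolean-result form `updateRecall(model, result, elapsed)` (newer ebisu releases take success/total counts instead):

```python
import time

import ebisu

HOURS = 3600.0
db = {}  # card_id -> (model, unix_timestamp); stand-in for a real store

def learn(card_id, half_life_hours=24.0):
    # First exposure is never graded: just store a prior and start the clock.
    db[card_id] = (ebisu.defaultModel(half_life_hours), time.time())

def quiz(card_id, success):
    model, ts = db[card_id]
    elapsed = (time.time() - ts) / HOURS
    # Graded review: fold the boolean result into the model, restart decay.
    db[card_id] = (ebisu.updateRecall(model, success, elapsed), time.time())
```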
Thank you so much for the explanation. My app's scheduling workflow however still has glaring holes.
Things are clear cut when all cards have been reviewed at least once — simply sort the deck of cards with `predictRecall`. Similarly, when all cards are new, just present them in whatever order the app has them stored in.
The intermediary states, however, are a bit more confusing to me:
- Why can't an initial study be marked as fail? It is possible to simply select a default half-life that is sensible.
- Should a `defaultModel` be assigned to a card before or after the first time it's studied?
- You said: "rinse and repeat until you get tired of quizzing the same four cards and learn more cards" — how could this be where the app determines whether to present a recently studied card or an entirely new one?

Thank you, and I apologize for the barrage of questions! 🙃
No worries!
Perhaps something I've not explicitly noted is the choice you have in mixing quizzing vs learning. In Curtiz, those are two separate modes of usage: when quizzing, you only look at the flashcards you've already learned and quiz on the lowest recall probability ones, over and over; and when learning, Curtiz introduces flashcards that you haven't seen. In this way, Curtiz actually partitions the set of cards into learned vs unlearned, leaving you with exactly the two situations you described.
But you can do something else too—you could set a threshold for memory recall, say 50%, and quiz until all learned cards have predicted memory recall >50%, and then switch to learning new cards.
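A sketch of that mode switch, with the same hypothetical `(card_id, model, learned_at_hours)` tuples as before:

```python
import ebisu

def next_action(learned_cards, now_hours, threshold=0.5):
    """Quiz the weakest learned card if any falls below `threshold`;
    otherwise signal the app to introduce new material."""
    if learned_cards:
        prob, cid = min((ebisu.predictRecall(m, now_hours - t, exact=True), c)
                        for c, m, t in learned_cards)
        if prob < threshold:
            return ("quiz", cid)
    return ("learn", None)
```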
Does that make sense? I'll answer your specific questions now—
> Why can't an initial study be marked as fail?

Look at it this way: for any flashcard, whether newly learned or an old one, your quiz app stores (1) an Ebisu model and (2) a timestamp for which that model applies—if `a, b, t = model`, you're saying that recall at time `timestamp + t` is `Beta(a, b)`-distributed. The model has to come from `defaultModel` or from `updateRecall`, the only two functions in the Ebisu API that produce a model, but `updateRecall` takes a model as an argument, which for new cards, you lack. So when your user indicates they've committed a card to memory, that's when you store the timestamp and a `defaultModel`, to encode your belief in their ability to recall.
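Concretely, in the Python ebisu package (where a model is an `(alpha, beta, t)` tuple; the 24-hour prior here is just an illustration):

```python
import time

import ebisu

model = ebisu.defaultModel(24.0)  # a Beta prior on recall 24 hours out
timestamp = time.time()
# Reading: 24 hours after `timestamp`, this card's recall probability is
# Beta(alpha, beta)-distributed with alpha = beta, so its expected recall
# at the 24-hour mark is 50%—a 24-hour half-life.
```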
Your app could certainly adopt the Anki mechanism you describe—
- the initial study just stores a fresh model (`model = defaultModel()` and timestamp),
- and each subsequent review is a graded quiz (`updateRecall(model, ...)`),

but in terms of the API Ebisu provides, the first time your user indicates they've learned a flashcard, that's not a quiz that can be graded (since you don't have an existing Ebisu model).
> It is possible to simply select a default half-life that is sensible

Absolutely. In Curtiz, we used to do exactly this: all newly-learned cards had a default half-life of 15 minutes. But then we allowed users to scale that manually if they want to, because it's frequently the case that you are much more familiar with some cards than others. You still want the SRS to remind you of something you know well, just not with the same frequency as something you've never seen before.
> Should a `defaultModel` be assigned to a card before or after the first time it's studied?

Definitely when the user indicates they've committed it to memory—so at the first time it's studied, and certainly not before. Recall that your app needs to store a model and timestamp for each learned card, because the memory decay encoded by the model starts at the timestamp. After you quiz, call `updateRecall`, and update your model with its return value and the quiz's timestamp. The new memory decay begins then. Before a card has been studied, it's nothing to Ebisu.
You said: "rinse and repeat until you get tired of quizzing the same four cards and learn more cards." — How could this be where the app determines whether to present a recently studied card or an entirely new one? I tried to answer this above: if your app's design interleaves quizzing and learning, then you'd set a minimum threshold on recall probability, and if no flashcards are below that, you ask the user to learn new things. But this is totally your app's decision and not something Ebisu cares about. Ebisu tries (I hope!) to be unopinionated about everything about your app, which is why there's just the three functions it exports:
defaultModel
for when a flashcard is freshly learned,predictRecall
for deciding whether/which to review, andupdateRecall
after a review happens, when you have a boolean quiz result.Aside You can also update cards passively, when you don't have a "quiz" as such. A simple example might be cloze-deleted card: (Wellington) is the capital of (New Zealand)
has two Ebisu models, one for each cloze. When you first learn the flashcard, both models get set via defaultModel
And you can only pick one to quiz, so you quiz "___ is the capital of New Zealand". Now, you go to update Wellington's Ebisu model with `updateRecall`—this is totally normal. But what about New Zealand's model? Certainly by doing this quiz you've also practiced "New Zealand", and it's wrong to pretend its memory is decaying just as before. So this is a passive review for New Zealand: you just update New Zealand's timestamp to be the same timestamp as the quiz, thereby resetting its decay. (You don't modify its model, since you have no evidence about whether the user would have remembered or forgotten it if it had been an active review and they'd had to summon "New Zealand" from memory.)
What this means is that, assuming the user answered "Wellington" correctly, Wellington's memory is strengthened and decays more slowly than before, from the timestamp of the quiz. But so too does New Zealand's memory: it's refreshed and decaying anew, with its original model's rate of decay.
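A sketch of the two update paths, again storing each card as a `(model, unix_timestamp)` pair and using the boolean-result `updateRecall`:

```python
import ebisu

HOURS = 3600.0

def active_review(card, success, now):
    model, ts = card
    # Graded quiz: update the Beta model and restart decay from `now`.
    return (ebisu.updateRecall(model, success, (now - ts) / HOURS), now)

def passive_review(card, now):
    # Ungraded exposure: reset the decay clock but leave the model alone,
    # since there's no pass/fail evidence with which to update the prior.
    model, _ = card
    return (model, now)
```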
This is really nice because it can greatly reduce the burden of review. I'm working on an app that builds a dense dependency graph between Ebisu models: "when I review this sentence, I'm also indirectly reviewing all the vocabulary in the sentence; the overall sentence's Ebisu model gets updated via `updateRecall` but all the individual vocab cards' models' timestamps also get reset."
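With the `active_review`/`passive_review` helpers sketched above, that dependency update is just:

```python
def review_sentence(sentence_card, vocab_cards, success, now):
    # The sentence itself was actively quizzed and graded...
    sentence_card = active_review(sentence_card, success, now)
    # ...and every word in it was exercised indirectly: refresh their clocks.
    vocab_cards = [passive_review(v, now) for v in vocab_cards]
    return sentence_card, vocab_cards
```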
This only makes sense if you see the Ebisu model as modeling memory decay starting from a timestamp. I don't think there's an equivalent to this notion in the Anki methodology so I'm happy to explain this (or any other aspect) further, don't hesitate to ask.
(Note, I edited the previous comment on GitHub for clarity and apologize that the original in the email may be a bit incomprehensible in some places.)
Thank you for the thorough explanation—the threshold idea seems to be working decently. 🎉 I'll keep you updated on any new developments.
What's a good way to deal with new cards (i.e. ones that haven't yet been reviewed)? How should they be scheduled when sorting the deck?