fasiha / memorize-py

Pure-Python library implementing the Memorize algorithm for flashcard quiz scheduling
The Unlicense

Example #1

Closed dbof10 closed 4 years ago

dbof10 commented 4 years ago

Do you have an example of how to use it? Or maybe how to combine it with your Ebisu? Mostly I don't understand the q and T params. Can q just be anything from q=0.1 to q=1.0 to q=10.0? Can you help explain the T?

  1. if I use rng=None, is the output idempotent?
  2. Can you help with java version as well?

dbof10 commented 4 years ago

Hello, can you help me with the questions?

fasiha commented 4 years ago

Hey sorry for the delay. I'm not super-super-excited about using Memorize because of how it randomly-samples from the optimal review process. (That is, after it sets up the ideal probability distribution for a review schedule, it then has to actually create an actual concrete review schedule. It does that by randomly drawing a schedule from the probability distribution.)

In short, if you call the function ten different times, it'll propose ten different due dates. I'm not really sure that's much better than scheduling when probability of recall drops <50%. In theory, Memorize is better because it gives you an extra tuning parameter q which ensures that you don't review the same thing too often—but is the random nature of Memorize the best way to do that? I'm not sure how as a user I'd like that, so I haven't played with it much.
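To make the randomness concrete, here is a minimal sketch of how such a draw can work. It is not this library's code: it assumes the reviewing intensity from the Memorize paper, u(t) = (1 - m(t)) / sqrt(q), where m(t) is the recall probability, and it samples the first event by thinning. The names sample_due_time and exp_recall, and the 1-hour half-life, are made up for illustration.

```python
from math import sqrt
from random import Random


def exp_recall(t, halflife=1.0):
    """Hypothetical forgetting curve: recall probability after t hours."""
    return 2.0 ** (-t / halflife)


def sample_due_time(recall, q, T, rng=None):
    """Draw one review time in [0, T] by thinning (illustrative sketch).

    Assumes intensity u(t) = (1 - recall(t)) / sqrt(q), which is bounded
    above by 1 / sqrt(q) because recall(t) lies in [0, 1].
    """
    rng = rng or Random()
    max_rate = 1.0 / sqrt(q)
    t = 0.0
    while True:
        # Propose the next candidate from the bounding constant-rate process.
        t += rng.expovariate(max_rate)
        if t >= T:
            return T  # cap at the horizon
        # Accept with probability u(t) / max_rate = 1 - recall(t).
        if rng.random() < 1.0 - recall(t):
            return t


# Ten calls, ten different proposed due times (in hours).
draws = [sample_due_time(exp_recall, q=1.0, T=100.0) for _ in range(10)]
print(draws)
```

Running this a few times shows the spread of proposed due times for one and the same card.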

(Recall that I personally don't schedule reviews at all. I review as much or as little as I want to at any given time, and rely on Ebisu to present me with the cards with the lowest recall.)

Do you have an example of how to use it? Or maybe how to combine it with your Ebisu? Mostly I don't understand the q and T params. Can q just be anything from q=0.1 to q=1.0 to q=10.0?

I don't have a good sense of what to set q to in practical quiz apps. In numeric terms, for items with very low recall probability, the average value returned by schedule will be sqrt(q). So q=0.1 means the average review time will be sqrt(0.1)=0.32 hours, or 19 minutes, for such items. Similarly, q=10 means the average review time will be sqrt(10)=3.16 hours, or 190 minutes, for very-low-recall-probability cards.
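The arithmetic can be checked in a couple of lines. This sketch only evaluates sqrt(q) and converts hours to minutes. The helper name mean_interval_minutes is made up, and it assumes the time unit is hours, as in the numbers quoted here.

```python
from math import sqrt


def mean_interval_minutes(q):
    """Mean interval in minutes for a near-zero-recall card, assuming hours as the time unit."""
    return sqrt(q) * 60


for q in (0.1, 1.0, 10.0):
    print(f"q={q}: about {mean_interval_minutes(q):.0f} minutes")
# q=0.1: about 19 minutes
# q=1.0: about 60 minutes
# q=10.0: about 190 minutes
```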

Of course you want q to be small so cards with low recall probability are scheduled as soon as possible. But if you make q=0.1, your high recall probability cards also get scheduled sooner… what's the right balance? We have to try it out with a quiz app and see… the authors weren't able to do that with live students so we have to do that.

Can you help explain the T?

T is the max value the function will return. Set it to infinity to allow it to schedule arbitrarily far into the future. I thought about making infinity the default value for T but decided to require users to provide a max: that's a bit safer than potentially spending a long time calculating the value…

  1. if I use rng=None, is the output idempotent?

No, the opposite: if rng=None the function uses the system random number generator stream, so the return values will be random each call. If each call you use rng=Random(123) (for seed 123), then the function is idempotent, returning the same values each time.
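The seeding behavior can be seen with Python's standard random module alone. This is a small sketch that does not call this library. The helper three_draws is made up, and expovariate stands in for any function that consumes random numbers from the generator it is given.

```python
from random import Random


def three_draws(rng):
    """Consume three exponential draws from the given generator."""
    return [rng.expovariate(1.0) for _ in range(3)]


# A fresh generator with the same seed on each call gives identical output.
same_a = three_draws(Random(123))
same_b = three_draws(Random(123))

# One shared generator reused across calls: its stream advances between calls,
# so the second call sees different random numbers than the first.
shared = Random(123)
first = three_draws(shared)
second = three_draws(shared)

print(same_a == same_b, first == second)
```

So repeatability depends on constructing Random(123) anew for every call, not on reusing one seeded generator.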

  2. Can you help with java version as well?

If you promise to use it in your quiz app and tell me whether you find Memorize is better than "review when probability drops 50% (or 10% or whatever)", I'll help port it to Java :).

The way to use this with Ebisu is super-simple:

from memorizesrs import schedule
import ebisu

model = ebisu.defaultModel(0.25) # initially 15 minute half-life
timeSinceLastReview = 0.5 # half-hour since last learned/reviewed

dueIn = schedule(lambda t: ebisu.predictRecall(model, t + timeSinceLastReview, exact=True), q=1.0, T=100.)
# dueIn is "quiz is due in dueIn hours from now"

If you play with timeSinceLastReview and q you'll see that the algorithm might suggest scheduling a card you just learned before a card you learned a while ago.

fasiha commented 4 years ago

Of course you want q to be small so cards with low recall probability are scheduled as soon as possible. But if you make q=0.1, your high recall probability cards also get scheduled sooner… what's the right balance? We have to try it out with a quiz app and see… the authors weren't able to do that with live students so we have to do that.

There are lots of things you might do with Memorize other than just blindly scheduling the review in the future. You might group quizzes into hourly buckets and then pick a card at random from that bucket every hour.

You might use recall probability as a floor ("if any card has recall probability <0.01, review it as soon as possible").

Or you could just forget about scheduling reviews 😁
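As a rough sketch of the first two ideas (hourly buckets and a recall-probability floor), using only the standard library. The card names, due times, and recall probabilities below are invented for illustration. In a real app they would come from the scheduler and the memory model.

```python
from collections import defaultdict
from math import floor
from random import Random

# Hypothetical data: card -> (hours until due, current recall probability)
cards = {
    "card-a": (0.4, 0.30),
    "card-b": (0.9, 0.005),
    "card-c": (2.5, 0.80),
    "card-d": (2.1, 0.60),
}


def hourly_buckets(cards):
    """Group card names by the whole hour in which they come due."""
    buckets = defaultdict(list)
    for name, (due_in, _recall) in cards.items():
        buckets[floor(due_in)].append(name)
    return dict(buckets)


def below_floor(cards, floor_probability=0.01):
    """Cards whose recall probability is under the floor: review these first."""
    return [name for name, (_due, recall) in cards.items() if recall < floor_probability]


rng = Random()
buckets = hourly_buckets(cards)
urgent = below_floor(cards)
pick_this_hour = rng.choice(buckets[0])  # one random card from the current hour's bucket
print(buckets, urgent, pick_this_hour)
```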

dbof10 commented 4 years ago

Thanks for the very good explanation. Currently, I'm using Ebisu in my quiz app with probability < 50%. If I want to use Memorize, how do we compare Memorize to probability < 50%? Which metrics should we consider in this A/B testing?

dbof10 commented 4 years ago

Since it's a stochastic algo, can I use rng=Random(123) to make it idempotent and predictable?

fasiha commented 4 years ago

No :)

fasiha commented 4 years ago

Logging here for future use: https://gist.github.com/fasiha/e46a499ba564d0246213168016554848 has Python code to convert a history of reviews (from any quiz app) to first an Ebisu model and then finally to Memorize intervals.

import ebisu
from memorizesrs import schedule
from datetime import datetime, timedelta
from math import inf

learnTime = datetime(2020, 1, 27, 0, 0, 0)
reviewTimes = [
    learnTime + timedelta(hours=1.1), learnTime + timedelta(hours=4.4),
    learnTime + timedelta(hours=22.0), learnTime + timedelta(days=2.4)
]
reviewResults = [False, True, True, True]
initialModel = ebisu.defaultModel(0.25, 2.0)  # initial half-life = quarter-hour, α=β=2

def historyToEbisuModel(learnTime, initialModel, reviewTimes, reviewResults):
  """Convert history of quizzes to Ebisu model"""
  assert len(reviewTimes) == len(reviewResults)
  previousTime = learnTime
  model = initialModel
  for (reviewTime, result) in zip(reviewTimes, reviewResults):
    model = ebisu.updateRecall(model, result, (reviewTime - previousTime).total_seconds() / 3600)
    previousTime = reviewTime
  return model

def ebisuModelToMemorizeInterval(model, q=10.0, T=inf):
  """Use an Ebisu model to draw a MEMORIZE interval"""
  # Use the model passed in, not the global initialModel
  return schedule(lambda t: ebisu.predictRecall(model, t, exact=True), q, T)

ebisuModel = historyToEbisuModel(learnTime, initialModel, reviewTimes, reviewResults)
dueIn = ebisuModelToMemorizeInterval(ebisuModel, q=10.0, T=inf)
print("Quiz due in {} hours".format(dueIn))

try:
  import pandas as pd
  print("\n\nMonte Carlo analysis:\n")
  for model in [initialModel, ebisuModel]:
    print("Ebisu halflife={} hours".format(ebisu.modelToPercentileDecay(model)))
    print(pd.DataFrame([ebisuModelToMemorizeInterval(model) for n in range(10_000)]).describe())
except ImportError:
  print("Didn't run pandas analysis")

dbof10 commented 4 years ago

hey thank you for the example. memorizesrs is just a few lines. it would be great if you help me convert it to java.

fasiha commented 4 years ago

Ok sure, I think the following pointers might be useful—with these, I think you can probably put together a Java implementation faster than me :):

Sorry for taking so long to come back on this. This time I want to make sure you have everything you need to implement a Java version of this; let me know if anything remains unclear.

I'm really sorry this repo doesn't include any tests—since it's a stochastic algorithm, tests would be really helpful when checking ports, but I think since the code is so simple, a mechanical port should be fine.

I'm happy to review the Java code you come up with for correctness.

If you want me to write the Java implementation, it'll take me a few days.

fasiha commented 4 years ago

If I want to use Memorize, how do we compare Memorize to probability < 50%? Which metrics should we consider in this A/B testing?

This is a great question and I'm not sure what the answer is. Ideally, you'd test on a large population of users on a narrow set of flashcards and see which algorithm helps more people remember things better for longer 😅. But on a personal level, my ideas about validating which approach is better are probably as vague as yours.

dbof10 commented 4 years ago

For java, expovariate

public double getNext() {
    return Math.log(1 - rand.nextDouble()) / (-lambda);
}

What is lambda in this case, where λ is the rate parameter of the exponential distribution? Can you help with what value is used in Python?

fasiha commented 4 years ago

Right, lambda there is the same as the argument to expo/expovariate in this repo.

I don't know what the right value for lambda is 😞. In the API, it's called q, and I have some notes on how to mayyybe pick it here, but I haven't used the algorithm so I don't know how to pick this value.
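For a port, it may help to see the Java getNext() above written the same way in Python and checked numerically. The mean of an exponential with rate lambda is 1/lambda, so the sqrt(q) average quoted earlier in this thread would correspond to lambda = 1/sqrt(q) for a card with near-zero recall. That relation is an inference from the thread, not something confirmed from the repo's source, so check it against the code. The helper name next_exponential is made up.

```python
from math import log, sqrt
from random import Random


def next_exponential(rng, lam):
    """Inverse-transform sample, mirroring the Java getNext() above."""
    return log(1.0 - rng.random()) / (-lam)


q = 10.0
lam = 1.0 / sqrt(q)  # assumed rate, inferred from the sqrt(q) average
rng = Random(0)
samples = [next_exponential(rng, lam) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean, sqrt(q))  # the two numbers should be close
```

A check like this on the sample mean is also one way to test a port of a stochastic function without comparing individual draws.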

I may have said this before, but I'm not really sure whether the drawbacks of Memorize ((1) having to pick q, and (2) random intervals) are worth the trouble, compared to the simpler strategy of "quiz when probability of recall gets low"... I'm happy you're looking into it though, maybe I'm wrong and maybe the advantages are worth it.