ankidroid / Anki-Android

AnkiDroid: Anki flashcards on Android. Your secret trick to achieve superhuman information retention.
GNU General Public License v3.0

Forecast Statistic not predictive #3619

Closed kcroker closed 8 years ago

kcroker commented 9 years ago

As it currently stands, the Forecast statistic does not propagate previous performance trends when determining hypothetical future card loads. For example, if a user on average (with small deviation) answers 60% of the daily load at Hard, 20% at Good, 10% at Easy, and 10% at Again, then the forecast could easily anticipate card loads intelligently. At present it does not seem to do this at all, nor does it display the anticipated new daily allotment, so Forecast is not useful for determining effective review-only periods over which one can reduce the daily card load.

gregreen commented 9 years ago

There's a great explanation here about how the Anki forecast could be modified to be a real forecast. Right now, it just tells you the next time that each card will appear, without taking into account that cards will be rescheduled, and therefore potentially appear multiple times over any given timespan. If I understood the Anki source code better, I would take a shot at adding a real forecast myself. Maybe if someone else is willing to work with me on it, we can make some progress.

kcroker commented 8 years ago

I wrote it last night. I didn't follow the link you provided, sorry! I just implemented a first-order approximation that includes the number of new cards added each day and uses the target lapse probability to calculate a correction for each day (in the MONTH display), assuming that no new cards will lapse and that no already-lapsed cards will relapse. I also assumed (perhaps incorrectly) that the SQL query done in Stats.java automatically includes the recurrences of cards, assuming no lapses. Correcting this assumption seems to be a bit more involved, so I want to see how my first approximation handles in real use.

While this isn't perfect, it's orders of magnitude more accurate as far as a FORECAST is concerned :D As soon as I get the Android build system working, figure out how to export my Anki library and import it into an emulator, and make sure the mod is working, I will issue a pull request.

hssm commented 8 years ago

This graph sounds useful and I'd encourage you to submit an equivalent to the main desktop application.

I'd be willing to add this forecast graph in addition to the existing one. I want to avoid using the same name for something that is actually different from the desktop client because people will wonder why they are getting different results in different clients. I'll leave the name up to you.

kcroker commented 8 years ago

Thanks hssm! :D I just got the build system going (surprisingly painless, but so slow! I miss ant), but I have to figure out all this jarsigner stuff again and get it onto an emulated phone to make sure it does what I think it should. I will dig a bit into the Stats code and figure out how to add it to another pane. I've got some professional deadlines over the next two weeks, though; is it okay if I get this closed up in about 3 weeks?

Does the desktop app also use achartengine? The first order algorithm is very straightforward (10 lines total), but the visualization stuff may take some work, especially because I use Anki solely on my Nexus.

hssm commented 8 years ago

achartengine isn't used anywhere in AnkiDroid. We use a library written by @trashcutter, based on the code here. Hopefully that doesn't prove to be a problem. The desktop application uses something else entirely.

We plan to go into beta in the next few days so you will miss that window, but if it's not a very complex change we might still merge it in if it's still early in the beta.

kcroker commented 8 years ago

I thought it was achartengine because of comments in the Stats.java code (see attached screenshot). Perhaps this is vestigial code? In any case, you are a good motivator: I will try to get the pull request assembled over the next 48 hours to make the beta window!

hssm commented 8 years ago

Yeah it must be. We used achartengine before replacing it with the current one.

kcroker commented 8 years ago

@hssm adjustment is complete, please see attached screencap.

HARDCODES: The following are hardcoded values because I am not familiar with your project's resource files, and I could not readily find which preference keys store these values. If you could provide them to me, I would really appreciate it, as it would save me much time, and I can then go ahead and soften them.

I have nearly exhausted my vault of free time getting the damn emulator working on my janky distro. (screenshot attached)

gregreen commented 8 years ago

@kcroker I'd be interested in seeing your forecasting code. Is there a commit I can look at? Thanks for taking the time to write this, by the way!

kcroker commented 8 years ago

@gregreen Sure. I think I did this right... ^-^ I forked Anki, and then pushed my commit to my fork: https://github.com/kcroker/Anki-Android/commit/97701e17fbbd7fde7dbda4c1389fc1ea37d2c30a

kcroker commented 8 years ago

@gregreen @hssm I should note that you can easily add relapsing by changing one line, and fine-grained propagation (good, hard, easy) can be added by having 3 separate younguns arrays, propagating each separately on its own time constant (pulled from prefs), and then summing at the end. Relapsing works very well for the Month view, as the smallest unit of time is the day. However, this will break the Year and Deck life views (because you keep incrementing the same chunk incorrectly; this is presently the case with no relapses too, but the effect just flattens out, which is not unreasonable). IMO, the Year and Deck life views are of very limited use in Forecast anyway.
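To make the propagation idea concrete, here is a minimal first-order sketch, not the actual AnkiDroid code: the per-response arrays are collapsed into one due array, each day's reviews are split by response fraction and rescheduled at a fixed per-response interval, and the daily totals are summed. All names, fractions, and intervals below are illustrative placeholders.

```python
# First-order forecast sketch (illustrative, not AnkiDroid code):
# each day's reviews split by response mix and recur at fixed intervals.
def forecast(due, fractions, intervals, n_days=31):
    """due: cards initially due per day; fractions: response mix, e.g.
    {'hard': 0.6, 'good': 0.3, 'easy': 0.1}; intervals: days until the
    next review for each response type (fixed, hence "first order")."""
    due = list(due) + [0.0] * max(0, n_days - len(due))
    totals = [0.0] * n_days
    for day in range(n_days):
        totals[day] = due[day]
        for resp, frac in fractions.items():
            nxt = day + intervals[resp]
            if nxt < n_days:
                due[nxt] += due[day] * frac  # the card's recurrence
    return totals
```

With growing (rather than fixed) intervals per recurrence this would approach the behaviour described above, but the fixed-interval version already shows the rescheduling effect the stock Forecast ignores.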

kcroker commented 8 years ago

I went ahead and added the relapsing because it really works well, also fixed a minor bug. The Year and Deck life views are wrecked, so perhaps they can just be disabled for this pane. I briefly looked into this, but it wasn't immediately clear how to do it through ChartBuilder.java. Anyway, attached is the predictive Forecast picture and my actual personal Review Count history. You can see that the predictive Forecast correctly predicts a slowly rising linear trend (with the correct slope!) and qualitatively gets the "stepping" behaviour correct. If I were to enable the code that tracks actual card views, instead of unique cards reviewed, then it would probably match continuously with the Review Count, which is the behaviour you'd expect from an ideal student.

(screenshots attached)

hssm commented 8 years ago

The lapse rate is hardcoded at 10%
The interval is hardcoded at base 2
New cards is hardcoded at 20

Are you saying that these three values are each a deck option? I know the last one is but I can't figure out specifically what you mean by the other two. If you can tell me the exact name used for these values in the UI I can tell you how to retrieve them in code.

I will add the additional panel myself once you are satisfied with what you have. Just let me know and I'll work off your commit. I think a safe bet is to just blank the graph when choosing the Yearly or Deck Life option.

kcroker commented 8 years ago

@hssm I read that the Anki SRS system uses exponentially longer times between reviews, and this is in general what I've observed. The rate of this increase in the time before you see a card again is determined by the constant in the exponential. If you always press Good, in the default configuration the constant is ln(2): 1, 2, 4, 8, 16, etc. I think this is specified by the percentage given in Deck Options/Reviews/Interval Modifier. Say a card is at 4 days. You get it right and press Good. It will next show up at 4 + 100% * 4 = 8, so doubled. If it's Easy, it's 4 + 230% * 4 = 13 (if rounding down). Lapses are placed on intervals determined by Deck Options/Lapses/New interval and Minimum interval. I cannot find the modifier for Hard... Can you confirm that this is how Anki actually computes these things? I haven't dug that deep into the implemented algorithm!

As far as the lapse rate, that is engineered into the algorithm somehow. I don't know how, but I recall it being found (by research) that 90% recall success can usually be achieved with a time constant of ln(2).

kcroker commented 8 years ago

@hssm Also, thanks for adding the separate pane! What do you think of a title like "Predictive Forecast" ? Also, do you think it should be unique cards predicted, or card views predicted?

hssm commented 8 years ago

I think the value you are after is the card's factor (called its ease in the manual/UI). This value changes depending on how you answer cards so it will be different for each card. You can add a column to the browser (on the desktop) that shows you the % value of this factor, if you are interested.

If you strip away all the other things Anki does while computing an interval, the core equation becomes this:

Hard: interval * 1.2
Good: interval * factor
Easy: interval * factor * easy-bonus

As you can see, the factor for hard is hard-coded at 120%. Does this sound like what you need?
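A minimal sketch of these equations in Python; the 120% hard multiplier follows the equations above, while the 1.3 default easy bonus and the function shape are assumptions for illustration, not AnkiDroid code (the real scheduler also applies fuzzing, minimum intervals, the interval modifier, etc.):

```python
# Sketch of the core interval update quoted above (illustrative only).
def next_interval(interval, response, factor, easy_bonus=1.3):
    if response == "hard":
        return interval * 1.2              # hard-coded at 120%
    if response == "good":
        return interval * factor           # factor == the card's "ease"
    if response == "easy":
        return interval * factor * easy_bonus
    raise ValueError("unknown response: " + response)
```

For example, a 4-day card with the default 250% factor answered Good would next appear in 10 days.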

I have no clue about the Lapse Rate. I might need to dig through the code to figure that one out.

"Predictive Forecast" was what I had in mind so we will use that. As for your last question I honestly have no clue. This sort of stuff goes over my head. I'll let you guys decide.

kcroker commented 8 years ago

Well: if you want something that's really coherent with "Review Count" then we might want to name the pane "Predicted Count", place it to the right of "Review Count", and have it do predicted card views (because this is what Review Count is actually showing you).

hssm commented 8 years ago

By the way, there is an add-on for the desktop client you might be interested in looking at.

gregreen commented 8 years ago

I think the way to go is to replace the old "Forecast" graph with this one. If we have a "Predictive Forecast," I honestly can't see what the old-style "Forecast" is good for, since it doesn't actually predict anything. It will just be confusing for the users.

Given that, I think we should hold back this commit until it works for the "Year" and "Deck life" graphs as well. For the "Year" view, for example, we could just do a 365-day prediction, and then sum together the totals for each week. For the "Deck life" graph, I'm not sure what we should do. Maybe we should just default to a year for "Deck life"...

I also think we should try to get the interval formula right before merging. It would also make sense to do a query on the user's response history to figure out what their "batting average" is for learning, young and mature cards. According to this and this, the future workload is very sensitive to the percentage of wrong answers. I wrote a simulator in Python and found the same thing.

I'll take a stab at making the above changes as soon as I can get an Android emulator up and running on my laptop.

timrae commented 8 years ago

I think the "proper" way is to bring @dae on board with the changes that you're making, and send a PR to Anki Desktop. If you can get @dae on board then we could merge it into the main statistics code.

I agree with @gregreen that it doesn't make sense to add a whole new pane, so I think if you can't get @dae on board then we could make it an advanced setting in AnkiDroid which is disabled by default. In this case you should make your code into a standalone module (ideally using the "Hook" architecture so that we can make it a plugin in the future), and make a minimum number of edits to libanki to get it to call your module when enabled.

If you can get a consensus and a PR ready within say the next 2 weeks then we can consider merging the code into the 2.5 beta branch, otherwise it will probably be relegated to 2.6 which will likely be released approximately a few months before hell freezes over. :-p

kcroker commented 8 years ago

Wow, super stoked people even care! Thanks guys XD

@gregreen @timrae Though my voice matters little because I just got here, I'd like to say that I'm more inclined to agree with @hssm, only because this predictive forecast looks very much like the complement to Review Count; so maybe "Predicted Count", as I proposed a bit earlier.

@gregreen Concerning a more rigorous prediction scheme, and having anything useful on Year and Deck life scales, I really think it is not necessary, because

1) Deck life doesn't make sense, as it's a backward time measurement.
2) Year predictions are not really what you ever want, because any projection that far into the future is going to be swamped by systematic uncertainty without meticulous engineering of the algorithm.

The previously linked plugin gives very detailed information about specific cards, but again, this is not the information that was relevant to me. I wanted to get an idea of what my future daily card load would be, so I could schedule my reviews around my IRL life, and also get an idea of how long it would take, just reviewing, to get my daily card load down to something manageable. It didn't need to be exact, but the existing Forecast is useless for this. I suspect my need is in line with that of many users; what do you guys think?

So, as far as trying to add super-refined behaviour concerning the individual card "factor", I feel this may also be misguided: what you want is a prediction accurate to within, say, 10-20% over 31 days. I could go through and construct an average "factor" and use this to determine the time constant. I suspect that in my deck, the average factor is very close to 1.

kcroker commented 8 years ago

@timrae If you really want it refined, here is the way I see it, following @gregreen:

1) Figure out the % responses of hard, good, and easy as applied to Mature and Young cards (so 6 figures)
2) Propagate the appropriate hard/good/easy fractions of Mature and Young separately, according to the average ease

I feel at this level, though, you're going to be rounding up a lot from fractions. Again, I feel this may not really be all that necessary for the purpose.

timrae commented 8 years ago

@timrae If you really want it refined

I don't really have any opinion on the refinements tbh as I rarely look at the forecast, I just think that a) having two forecasts (one of which isn't in Anki desktop) could be a bit confusing, and b) it would be good to hear @dae's opinion, i.e. whether or not he'd consider including these changes in Anki Desktop.

gregreen commented 8 years ago

@kcroker I don't want to shoot your idea down, and I really appreciate that you've made the forecast statistic useful! If we don't get a more advanced forecast implemented by the deadline, then I think we should go with what you've already written.

I also strongly support replacing the current forecast statistic, because I think a predictive forecast is strictly better than a non-predictive forecast (which is really an oxymoron).

I do think, however, that we should increase the forecast time when people select "Year" or "Deck life," simply because it's the behavior I'd expect, and it doesn't seem like much of a change to implement. I'd propose:

1) Month view: each bar in the graph represents one day
2) Year view: each bar represents one week (as in the other statistics panels)
3) Deck life: each bar represents one month (seems like the natural progression)

I actually think that longer-term forecasts are useful. I have a deck, for example, that I've been studying and adding to for months, and I'd like to know how my review burden will decrease over the long term.
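Under that proposal, all three views could share one daily forecast and simply rebin it per view; a minimal sketch (the function name is illustrative):

```python
# Collapse a daily forecast into wider bars: bin_days = 1 for Month,
# 7 for Year, 30 for Deck life, per the proposal above.
def rebin(daily_counts, bin_days):
    return [sum(daily_counts[i:i + bin_days])
            for i in range(0, len(daily_counts), bin_days)]
```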

kcroker commented 8 years ago

@gregreen then I need to alter the SQL query to not do chunking, as propagations only work when done atomically (on the day). I will do this on Friday. I will also fix the hardcode number of new cards, and explore working out the "batting averages" and average deck ease. In practice, I'd expect the "ease" of cards to be rather peaked in three separate regions corresponding to hard, good, and easy. If you had a moment before Friday and a SQL command line, could you see if this is indeed the case? I can then use your SQL queries to determine time constants.

gregreen commented 8 years ago

@kcroker Here are the queries to get the response probabilities for each type of card:

New

select
  count() as N,
  sum(case when ease=1 then 1 else 0 end) as repeat,
  sum(case when ease=2 then 1 else 0 end) as good,
  sum(case when ease=3 then 1 else 0 end) as easy
from revlog
where type=0

Young

select
  count() as N,
  sum(case when ease=1 then 1 else 0 end) as repeat,
  sum(case when ease=2 then 1 else 0 end) as hard,
  sum(case when ease=3 then 1 else 0 end) as good,
  sum(case when ease=4 then 1 else 0 end) as easy
from revlog
where type=1 and lastIvl < 21

Mature

select
  count() as N,
  sum(case when ease=1 then 1 else 0 end) as repeat,
  sum(case when ease=2 then 1 else 0 end) as hard,
  sum(case when ease=3 then 1 else 0 end) as good,
  sum(case when ease=4 then 1 else 0 end) as easy
from revlog
where type=1 and lastIvl >= 21

Those queries give you the raw numbers of responses for each type of card. Of course, division by N in each case gives the percentage probability.
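As a sketch of that last step, the mature-card query above can be turned into a probability table like so (assuming a sqlite3 connection to the collection database; this mirrors the query, it is not existing AnkiDroid code):

```python
import sqlite3

# Turn the raw counts from the mature-card query above into response
# probabilities. `con` is assumed to be a sqlite3 connection to a
# collection database containing the revlog table.
def response_probs(con):
    n, again, hard, good, easy = con.execute(
        "select count(), "
        " sum(case when ease=1 then 1 else 0 end), "
        " sum(case when ease=2 then 1 else 0 end), "
        " sum(case when ease=3 then 1 else 0 end), "
        " sum(case when ease=4 then 1 else 0 end) "
        "from revlog where type=1 and lastIvl >= 21").fetchone()
    return {name: count / n for name, count in
            zip(("again", "hard", "good", "easy"),
                (again, hard, good, easy))}
```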

There's one complication that I didn't think of before: factor. Each card has a "factor" that determines the new interval when "good" is pressed (@hssm gave the equations above). But looking through my collection, I noticed that "factor" varies from about 1.3 to 2.8, and is highly correlated with how many times a card has lapsed. Obviously, factor is being updated with each review, in order to achieve a target lapse rate for each card. I think it would be too much to simulate this, but we could use the current "factor" for each card, and just assume it remains constant throughout the forecast.

I think the best way to approach this is really to just do a simulation. In Pythonic pseudo-code, the algorithm would be

import numpy as np

# Time bins for "Month" view. Change for "Year" or "Deck Life"
n_time_bins = 30
time_bin_length = 1
t_max = n_time_bins * time_bin_length

# Forecasted number of reviews
#   0 = Learn
#   1 = Young
#   2 = Mature
#   3 = Relearn
n_reviews = np.zeros((4, n_time_bins))

for card in cards:
    # Initiate time to next due date of card
    t_elapsed = card.due

    c_ivl = card.ivl    # card interval
    c_type = card.type  # 0=learn, 1=young, 2=mature

    # Simulate reviews
    while t_elapsed < t_max:
        # Update the forecasted number of reviews
        n_reviews[c_type, int(t_elapsed / time_bin_length)] += 1

        # Simulate response
        c_type, c_ivl, correct = sim_single_review(c_type, c_ivl, card.factor)

        # If the card failed, update the "relearn" count
        if not correct:
            n_reviews[3, int(t_elapsed / time_bin_length)] += 1

        # Advance time to next review
        t_elapsed += c_ivl

The sim_single_review function just simulates one review of the card, based on the response probabilities from the SQL queries given above. It returns a new card type (new, learn or mature) and interval, and also says whether the user got the card correct, so that the "relearn" count can be updated.

I think simulating the reviews is the best method, because it's fairly straightforward, and allows one to tweak the logic of how cards are updated easily. It also makes changing the forecast length trivial. If you want to smooth out the forecast history, so that it's less stochastic, you can just run the simulation N times and divide the review totals by N at the end.
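One hypothetical shape for sim_single_review, combining response probabilities from the revlog queries above with the hard/good/easy interval equations quoted earlier. The 21-day maturity threshold matches the queries; the 1.3 easy bonus and the 1-day relearn interval are assumptions, and none of this is AnkiDroid code:

```python
import random

# Hypothetical sketch of sim_single_review. probs maps responses to
# probabilities, e.g. {"again": 0.1, "hard": 0.2, "good": 0.6, "easy": 0.1}.
def sim_single_review(c_type, ivl, factor, probs, easy_bonus=1.3):
    """Return (new_type, new_interval, correct) for one simulated review."""
    response = random.choices(list(probs), weights=list(probs.values()))[0]
    if response == "again":        # lapse: assume a 1-day relearn interval
        return 1, 1, False
    if response == "hard":
        ivl = ivl * 1.2
    elif response == "good":
        ivl = ivl * factor
    else:                          # easy
        ivl = ivl * factor * easy_bonus
    new_type = 2 if ivl >= 21 else 1   # mature once the interval is >= 21 days
    return new_type, ivl, True
```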

gregreen commented 8 years ago

One thing I left out in my simulation pseudo-code was how to limit the number of new cards introduced each day. This could be handled by staggering the initial t_elapsed for the new cards.
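The staggering could be as simple as assigning each new card an introduction day (a sketch; new_per_day stands in for the deck's new-cards-per-day option):

```python
# New cards enter the simulation in batches of new_per_day, so their
# initial t_elapsed spreads them across future days.
def stagger_new_cards(n_new_cards, new_per_day=20):
    """Return the initial t_elapsed (introduction day) for each new card."""
    return [i // new_per_day for i in range(n_new_cards)]
```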

gregreen commented 8 years ago

I've created a Gist that shows roughly how a simulation could work (in Python): https://gist.github.com/gregreen/934e3cf8b733f23f7b32

kcroker commented 8 years ago

Wow! You've contributed a lot, much in line with the earlier blog post.

About the queries, I was asking for the factor, which I believe someone said was synonymous with the ease xD I guess they are not!

I'll need a bit to review what you have written before I can comment, but it looks O(N) [cards], whereas what I have written is O(1) [days in a fixed interval]. These runtime concerns may not be an issue, but since handheld devices tend to be slower, and people tend to generate large decks, I wanted to be a bit conservative.

gregreen commented 8 years ago

I think the simulation method scales as O(N sqrt T), where N is the number of cards, and T is the length of time of the forecast. On the 9-year-old laptop I'm currently working on, the time constant is about 7.5 microseconds, so a 365-day forecast for 1000 cards would take 140 milliseconds. I suspect that most modern phones are faster than this laptop. I'm also running naive Python code, and I suspect that Java running on a Dalvik VM will be better optimized. Basically, I doubt time will be a problem in practice.
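For what it's worth, the arithmetic checks out; a throwaway sketch of the estimate, with the ~7.5 microsecond constant taken from the measurement above:

```python
import math

# Back-of-the-envelope runtime for the O(N * sqrt(T)) simulation,
# using the ~7.5 microsecond time constant measured above.
def estimated_runtime(n_cards, n_days, time_constant=7.5e-6):
    return time_constant * n_cards * math.sqrt(n_days)

# 1000 cards over 365 days comes out to roughly 0.14 seconds.
```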

kcroker commented 8 years ago

O(N) = O(N*k) for any fixed k > 0 by definition, right? It is deck length that is the unconstrained variable across runs, not T. But anyway, that stuff is silly; I'm glad you got some benchmarks for your simulation:

Take the dominant Heisig deck of 3000 cards. Now run your simulation 10 times to generate the averages you wanted. This gives a time of ~4.2 seconds, which I feel is troublesome. Hardware from 9 years ago is maybe a factor of 2 off in clockspeed from modern stuff, so apart from cache advantages (which could be substantial...) that's 1-2 seconds to make the forecast.

But anyway, I'm getting an odd feeling here, so I'm gonna be naive and honest in the hopes that things stay peachy? You seem to have put in a lot of work here to write your simulation code, which makes me feel like an ass for not really wanting to proceed with the implementation in that way. I'm sorry man, just my disposition at present. I intend no insult and bear no malice!

AnkiDroid is an amazing project and has been an indispensable part of my life, every day, since June of this year. It has made my dream of doing ethnographies of Japanese artists one step closer, and I am truly grateful you have all dedicated the time that you have. Since this project is you guys' turf, and I'm not here to cause trouble, perhaps it is best I use my own fork on my own device.

gregreen commented 8 years ago

Both N and T can vary separately, since decks can have different numbers of cards, and we might give forecasts for 1 month, 1 year, or deck life (whatever time we choose to use for that screen). I actually don't think we need to do 10 separate simulations to smooth. I think one simulation is fine. I really think the simulation runtime will be well below a second on most phones, and completely unnoticeable on any modern laptop. A 3000-card deck would take 120 milliseconds to run a 1-month simulation on my extremely old laptop. A 1-year simulation would take 430 milliseconds.

But if you really want to do the other type of forecast, that's fine. I'm really sorry I can't directly contribute to the AnkiDroid codebase right now (no Android SDK set up yet, and no experience with Java).

dae commented 8 years ago

Bear in mind that some users have revlogs of 500k records or more and hundreds of thousands of cards, so any performance losses need to be thought about with that in mind; it may make sense to keep the simple forecast and have a link or button to get a more detailed but slower one. If you're keen to get this into the computer version, I'd recommend doing it as an add-on for now, and we can see how popular it becomes.

timrae commented 8 years ago

@dae Thanks for the input! A button to manually trigger the slower calculation sounds like a good potential compromise. IIRC all the statistics are calculated at the same time when the statistics window is first opened, so 1 second would be quite long to have by default IMO.

kcroker commented 8 years ago

@dae @timrae Very glad to hear people are still interested and sympathetic to the design considerations of the quick and already integrated approach. Ultimately, we will receive feedback from the users that "new forecast is awesome!" or "new forecast sucks and is never right." I ultimately feel this feedback should dictate future algorithmic directions. After all, Wu-Tang is for the children ;)

@dae I'm detached from the computer version, as I don't use it at all. The algorithm, apart from the specifics on rendering the graph, is decoupled so it would not be difficult at all to transfer.

@timrae Of course I have no issues with additional statistics, but I should make clear that I have neither the time nor the inclination to implement the proposed Monte Carlo approach in Java. There is just too much variability over longer scales (missed days, breaks from learning new content, etc.) that can never be predicted, because it depends on the user.

All this being said, thank you both for your interest and encouragement! I will wrap things up within the next 3 days and issue the pull request!

timrae commented 8 years ago

@kcroker

After seeing @hssm and @dae's responses, my recommendation (I'm the project leader, but I'm not intending this as the "final word" or anything) is to introduce a new AnkiDroid CheckboxPreference (e.g. in Settings > Advanced > Plugins > Advanced statistics engine) and let the user switch between the libanki algorithm and your algorithm.

I strongly encourage you to make your code as self-contained as possible, i.e. in its own class with just a few lines of modification in libanki. Ideally you would make a new hook (you could use the other hooks as an example, e.g. the leech hook), which is installed when the setting is enabled and overrides the standard libanki method. This will make it easier for us to maintain the stats code, and will also ensure that it can easily be made into a plugin if we ever get the plugin architecture working. If you don't have time to figure out the hook thing, just refactoring as much as possible into a new class would be great.

@gregreen could then add his algorithm as a third alternative at a later point, if he wishes.

kcroker commented 8 years ago

@timrae Sounds like a plan. It might take me a few extra days to get fluent with your hook approach, but I'll make it happen.

jvanprehn commented 8 years ago

Keeping track of the number of cards per state at each time instant in gregreen's code gives the following:

(chart attached)

To generate the screenshot I also 'simulated' the past using the review log so that past and future can be displayed in one chart.

Is a chart like this considered useful? (it's actually just the pie chart which already exists in Anki + time dimension).

kcroker commented 8 years ago

That graph cannot be correct, given the existing color definitions. The new learns certainly should not be changing (your red, usually blue). Could you please remove any card caps (limits) and adjust the colors to be consistent with their current definition in Anki, for easier comparison?

jvanprehn commented 8 years ago

Using correct color definitions and a correct label (unseen instead of new) we get this:

anki_forecast_correct_colors

I already removed limits in my previous post.

Please note that this graph plots the history and future of the pie chart.

Plotting the predicted number of reviews gives the following plot (yes, the one learning this deck hasn't learned for a couple of days and is now making up for it; because I removed the max_reviews_per_day limit, we predict that she'll make up for all of it tomorrow):

anki_predicted_n_reviews

But that work has already been done by gregreen (in Python).

My version of gregreen's code in Java is here: https://github.com/jvanprehn/AnkiStatistics

(Path, deck creation timestamp etc. are in Settings.java)

I think that my use case was different from yours:

  1. You want to predict the number of reviews per day to anticipate card loads
  2. I want to predict the number of mature/learning/new cards

These two use cases have in common that they can be implemented using almost the same code.

While coding it I realized that you can modify it to also include the past (instead of predicting, we look in the review log at what the actual outcome of each card review was).

I was wondering in my previous post (and still am) if a chart like that (including the past) is considered useful.

If so, I (or we) can spend some time making it more efficient (simulating the entire past per card is not efficient, since we can just count the number of cards per state at the beginning and end of, say, every month).

I see that this issue has been given the 'accepted' label. Does this mean that you are getting the hook approach working? Then we can consider adding this Monte Carlo approach as well (next to your one-step lookahead, which might even be a special case of this Monte Carlo approach, though I didn't check whether it actually is). My code actually (almost) contains a generalization of the Monte Carlo approach, in which we can do a complete traversal of the future tree for small decks with a small interval to be predicted (see ReviewSimulator.java).
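To make the simulation idea being discussed concrete, here is a sketch of the Monte Carlo forecast: each card is rolled forward day by day, sampling pass/fail, and reviews are counted per day. The scheduling rule (interval multiplied by 2.5 on pass, reset to 1 on lapse) and the 90% pass probability are illustrative assumptions, not Anki's actual scheduler; the real implementation is in jvanprehn's ReviewSimulator.java.

```java
import java.util.Random;

// Sketch of a Monte Carlo review forecast: roll each card forward over a
// horizon, sampling pass/fail, and tally the reviews per day. The pass
// probability (0.9) and interval rule (*2.5 on pass, reset on lapse) are
// made-up assumptions for illustration, not Anki's real scheduler.
public class ForecastSimulator {

    static class Card {
        int due;      // days from today until next review
        int interval; // current interval in days

        Card(int due, int interval) {
            this.due = due;
            this.interval = interval;
        }
    }

    static int[] simulate(Card[] cards, int horizonDays, long seed) {
        Random rng = new Random(seed);
        int[] reviewsPerDay = new int[horizonDays];
        for (Card c : cards) {
            int due = c.due;
            int interval = c.interval;
            while (due < horizonDays) {
                reviewsPerDay[due]++;
                if (rng.nextDouble() < 0.9) {
                    // Pass: the interval grows.
                    interval = Math.max(1, (int) (interval * 2.5));
                } else {
                    // Lapse: the card comes back tomorrow.
                    interval = 1;
                }
                due += interval;
            }
        }
        return reviewsPerDay;
    }

    public static void main(String[] args) {
        Card[] deck = { new Card(0, 1), new Card(2, 4), new Card(5, 10) };
        int[] forecast = simulate(deck, 30, 42L);
        int total = 0;
        for (int n : forecast) total += n;
        System.out.println("reviews over 30 days: " + total);
    }
}
```

Averaging the per-day counts over many seeded runs gives the Monte Carlo estimate; the "complete traversal of the future tree" variant would instead enumerate both the pass and lapse branches, weighted by their probabilities, which is only tractable for small decks and short horizons.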

kcroker commented 8 years ago

@jvanprehn Okay, I see what you've done now. This actually looks really good; thank you for putting in the effort to translate from Python.

I suspected that this approach would give results more or less equivalent to what you're calling the "one-step lookahead." Qualitatively, it looks the same, and I wouldn't be surprised if higher-order lookaheads stay within ~10% of the first-order prediction.

If runtimes are not an issue in practice, even with enormous decks of 10,000 cards (as @dae pointed out), then I cannot object further to the @gregreen approach. My primary complaint was that it was not implemented, but you fixed that for him ;)

As for the hooks, no. I got swamped with work obligations and haven't gone any further than my original proof-of-concept fork. I don't think hooking it is really necessary; you could just drop the code in directly (see the discussion above for other perspectives), since the current code doesn't seem useful to anyone. That said, please verify that this approach doesn't grind on old phones (try it on non-optimal hardware), or perhaps make the lookahead adjustable.

As for viewing the past, doesn't the Review Count tab already do this? My thought (see above) was just to have the Forecast tab to the right of Review Count, so that it's t < 0 to the left and t > 0 to the right.

Again, thanks for taking initiative!

jvanprehn commented 8 years ago

I think that your suspicion is right that this approach will give results more or less equivalent to what I was calling the "one-step lookahead." We can try that later.

Where can I get huge decks (~10,000 cards) with huge review logs (~500k records)? I can find huge decks, but I want ones with huge review logs, just to see whether runtimes are an issue in practice.

I think we have to look into hooks or some alternative so that we can at least switch off the simulation, adjust the lookahead, or adjust the number of simulations.

Yes, the Review Count tab already shows the past. I was too focused on the beauty (genericity) of gregreen's algorithm. I would just plot the cumulative lines differently, like the bars in my previous post: one line for unseen, one for young + unseen, and one for mature + young + unseen. But that's maybe something for later.

My todo list for now (from high to low priority):

My dreams for the future (maybe will be done, maybe not):

jvanprehn commented 8 years ago

anki_forecast_android

Code is here: https://github.com/jvanprehn/Anki-Android/commit/bffcb62c9d5262f56f3d6856db4453b81c265ae2

Additional TODO items before issuing a pull request:

Questions:

jvanprehn commented 8 years ago

Is it okay if the settings go here? I put them below Plugins, not nested inside Plugins, since the advanced statistics are a set of preferences rather than a single on/off switch. If we nest them inside Plugins, we get Plugins -> Advanced Statistics on/off -> Advanced Statistics settings (3 levels deep). Right now there is just one preference depending on advanced statistics, but there will be more...

anki_settings

timrae commented 8 years ago

It's better if you do it as a sub-sub screen, like I did with the custom sync server preference in the latest development branch.

jvanprehn commented 8 years ago

@timrae Thanks for pointing me to an example of how to do it. It now looks like this: anki_settings_advanced anki_settings_subsub

timrae commented 8 years ago

@jvanprehn

Thanks for pointing me to an example on how to do it. It now looks like ...

Looks good!

Is it okay to put the classes as inner classes in Stats.java or should they go elsewhere?

As I previously mentioned you should put all of this code into a self-contained module. The modifications to libanki should be pretty minimal.

Using a hook is not a merge requirement or anything, but I'd like to encourage you to do it that way: it should make it easy to port to the AnkiDroid plugin architecture when it's eventually written (not to mention to an Anki desktop plugin), and so should prevent the code from simply being stripped out in the future with no plugin to replace it.

jvanprehn commented 8 years ago

I see the following possibilities:

  1. Make it into a hook and modify com.ichi2.libanki.Stats slightly to support the hook.
  2. Subclass com.ichi2.libanki.Stats and use the subclass in com.ichi2.anki.stats.ChartBuilder.calcStats. In the subclass do the same as in (1).

Which one is preferred? Or is there a third one? Of course we can leave out the hook and do it all in the subclass, but if I get the hook working, I'll use it...

timrae commented 8 years ago

At first I was thinking subclassing was a good idea, but after a little bit of thought I don't think that would be a great design in terms of encapsulation (i.e. we want to avoid changes in libanki breaking your code). I think making small modifications to libanki which outsources the calculations to your plugin module is the way to go. Ideally that would take the form of a hook (which I don't think would be too hard to do), but that's not strictly necessary.

jvanprehn commented 8 years ago

I got the hook working. I only needed small changes to libanki/Stats.java. Code is here: https://github.com/jvanprehn/Anki-Android/commit/f9e4c9d51b5db17cf0da23842a45f3417d984953. I am not completely happy with it, because I introduced a new object (StatsMetaInfo) which duplicates lots of fields from Stats.java in order to pass data from the hook back to Stats.java. I was thinking about putting getters and setters in Stats.java and passing the Stats object to the hook to avoid duplicating all those fields, but that would have the same drawback as the subclassing you mentioned (changes to libanki would break my code).

The problem is really that either x has to speak the language of y, or y has to speak the language of x (x being Stats.java and y being the hook). I chose to have y speak the language of x, so if x changes, y has to change; if we don't update it, it will break. I think that problem exists with subclasses and with hooks alike.

Do you have other ideas? Or is this just fine?
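For concreteness, the trade-off above can be sketched as follows. The field and method names here are illustrative, not the actual ones in jvanprehn's branch: the hook fills in a plain carrier object, and Stats.java copies the results out, so the coupling between libanki and the plugin is confined to that one class.

```java
// Sketch of the StatsMetaInfo idea: a plain value object that carries the
// hook's results back into Stats.java. Field names are illustrative, not
// the actual AnkiDroid ones. The libanki/plugin coupling is confined to
// this one carrier class.
public class StatsMetaInfoSketch {

    static class StatsMetaInfo {
        boolean statsCalculated; // did a hook produce a result?
        int maxCards;            // y-axis maximum for the chart
        int[][] seriesList;      // one array per line in the chart
        // ... in a real version: axis titles, legend labels, colors, etc.
    }

    interface AdvancedStatsHook {
        // The hook fills in the carrier object instead of touching Stats directly.
        StatsMetaInfo calculateDueAsMetaInfo();
    }

    // What Stats.java would do: ask the hook first, fall back otherwise.
    static int[][] calculateDue(AdvancedStatsHook hook) {
        if (hook != null) {
            StatsMetaInfo info = hook.calculateDueAsMetaInfo();
            if (info != null && info.statsCalculated) {
                return info.seriesList;
            }
        }
        return new int[][] { { 1, 2, 3 } }; // placeholder for the standard forecast
    }

    public static void main(String[] args) {
        // No hook installed: the standard path.
        System.out.println(calculateDue(null).length);

        // Hook installed: the advanced-statistics path.
        AdvancedStatsHook hook = () -> {
            StatsMetaInfo info = new StatsMetaInfo();
            info.statsCalculated = true;
            info.maxCards = 10;
            info.seriesList = new int[][] { { 3, 2, 1 }, { 1, 1, 1 } };
            return info;
        };
        System.out.println(calculateDue(hook).length);
    }
}
```

The field duplication is the price of the decoupling: the carrier object changes only when Stats.java's chart inputs change, which keeps the breakage surface small and explicit rather than spread across a subclass.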