pwintz / beyondki

Prerequisite sorting for Anki decks

Sort Cards Based on Prerequisites and a Secondary Sorting Criteria #3

Closed pwintz closed 1 year ago

pwintz commented 2 years ago

Problem Statement

We need to determine the order in which cards are introduced, based on the prerequisite dependencies listed in tags combined with either the creation time or the new card order.

We need a strict total order to sort the cards. The prerequisites induce a strict partial order on the notes. If card A is a prerequisite of card B, then A and B are comparable and we could say A < B (meaning A must be introduced before B). Additionally, the new card order provides a strict total order on Anki cards where (by default), for every pair of cards A and B, if A was created before B, then A < B.

The question is how to create a new strict total order that

  1. agrees with the strict partial order induced by the prerequisites whenever a pair of cards are comparable.
  2. agrees with the new card order "as much as possible" (we need to determine what this means).
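To make the two requirements concrete, here is a minimal sketch of how a candidate ordering could be checked. The function names, the string card labels, and the choice of counting inversions as a measure of "as much as possible" are all assumptions for illustration; the thread has not settled on a definition.

```python
from typing import Dict, List, Set


def respects_prerequisites(order: List[str], prereqs: Dict[str, Set[str]]) -> bool:
    """Return True if every card appears after all of its prerequisites.

    Assumes every card mentioned in `prereqs` also appears in `order`.
    """
    position = {card: idx for idx, card in enumerate(order)}
    return all(
        position[p] < position[card]
        for card, ps in prereqs.items()
        for p in ps
    )


def inversions(order: List[str], new_card_order: Dict[str, int]) -> int:
    """Count pairs of cards that appear in the opposite order to the new card order.

    This is only one candidate measure of disagreement (an assumption).
    """
    keys = [new_card_order[c] for c in order]
    return sum(
        1
        for a in range(len(keys))
        for b in range(a + 1, len(keys))
        if keys[a] > keys[b]
    )
```

Under this reading, a good ordering is one where `respects_prerequisites` holds and `inversions` is small.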

The graph of prerequisites between notes consists of one or more multitrees.

Example Prerequisite Graph

The following image shows an example of how cards can be related to each other via prerequisites. The tags for each card are listed below it.

[image: example prerequisite graph]

For this example, the strict partial order on the cards is as follows,

For each card that does not have a card less than it, we say that it is a minimal card. Thus, for this example, A, B, C, D, E, and I are minimal cards.

(Possible) Sorting Algorithm

  1. Create a note graph, a tag dictionary, a prerequisite dictionary, and a card queue. The note graph is shown above. In the tag dictionary, each key is a tag and the corresponding value is a list of note ids with that tag. In the prerequisite dictionary, each key is a tag and the corresponding value is a list of note ids that have that tag as a prerequisite. The card queue records the order in which cards will be introduced.
  2. Find all of the minimal notes in the graph.
  3. Among the set of cards on the minimal notes, find the minimum card (the card with the lowest new card order).
  4. Add the minimum card to the card queue and update the graph and dictionaries to remove that card (what this means is TBD).
  5. Return to step 2.
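Steps 1 and 2 could be sketched roughly as follows. The `prereq::` tag prefix is a hypothetical convention chosen for this example, not the project's actual tag format, and the helper names are made up as well.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical convention: a tag "prereq::X" means "this note requires tag X".
PREREQ_PREFIX = "prereq::"


def build_dictionaries(
    note_tags: Dict[int, List[str]],
) -> Tuple[Dict[str, List[int]], Dict[str, List[int]]]:
    """Build the tag dictionary and the prerequisite dictionary from note tags."""
    tag_dict: Dict[str, List[int]] = defaultdict(list)     # tag -> notes carrying it
    prereq_dict: Dict[str, List[int]] = defaultdict(list)  # tag -> notes requiring it
    for nid, tags in note_tags.items():
        for tag in tags:
            if tag.startswith(PREREQ_PREFIX):
                prereq_dict[tag[len(PREREQ_PREFIX):]].append(nid)
            else:
                tag_dict[tag].append(nid)
    return dict(tag_dict), dict(prereq_dict)


def minimal_notes(note_tags: Dict[int, List[str]]) -> Set[int]:
    """Return the notes that have no prerequisite carried by any note."""
    tag_dict, prereq_dict = build_dictionaries(note_tags)
    blocked = {
        nid
        for tag, nids in prereq_dict.items()
        if tag_dict.get(tag)  # the required tag is present on at least one note
        for nid in nids
    }
    return set(note_tags) - blocked
```

For example, with `{1: ["limits"], 2: ["derivatives", "prereq::limits"]}`, only note 1 is minimal, because note 2 requires a tag that note 1 carries.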

In a previous comment, I described the following process, so I'm going to copy it here for future reference

To that end, the way I want to choose the ordering is as follows. Start with all of the cards in the collection partitioned into two categories: "Free" contains every card that does not have any prerequisites and "Blocked" contains every card that has a prerequisite in the Free or Blocked cards.

  1. Among the Free cards, pick the card with the earliest creation time.
  2. Use forget to put that card at the end of the new card queue and remove it from the Free cards.
  3. Update the list of Blocked cards by moving each card that does not have any prerequisites in the Free or Blocked categories into the Free category.
  4. Repeat until the Free category is empty.

This will cause cards to be introduced in the order that users created them except when necessary to satisfy prerequisites. If a user wants to learn cards depth-first, then they can create cards in that order and the same for breadth-first.
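The Free/Blocked process can be sketched with a priority queue keyed on creation time. This is an illustration under stated assumptions, not the project's implementation: `sort_cards`, `created`, and `prereqs` are hypothetical names, appending to a list stands in for the `forget` step, and every prerequisite is assumed to be a card in `created`.

```python
import heapq
from typing import Dict, List, Set


def sort_cards(created: Dict[int, int], prereqs: Dict[int, Set[int]]) -> List[int]:
    """Order cards by creation time, except where prerequisites force otherwise.

    `created` maps card id -> creation time; `prereqs` maps card id -> set of
    card ids that must come first.
    """
    remaining = {c: set(prereqs.get(c, ())) for c in created}
    dependents: Dict[int, List[int]] = {c: [] for c in created}
    for card, ps in remaining.items():
        for p in ps:
            dependents[p].append(card)

    # "Free" cards: no outstanding prerequisites, ordered by creation time.
    free = [(created[c], c) for c, ps in remaining.items() if not ps]
    heapq.heapify(free)

    order: List[int] = []
    while free:
        _, card = heapq.heappop(free)  # earliest-created Free card
        order.append(card)             # stands in for the `forget` step
        for dep in dependents[card]:
            remaining[dep].discard(card)
            if not remaining[dep]:     # move from Blocked to Free
                heapq.heappush(free, (created[dep], dep))

    if len(order) != len(created):
        raise ValueError("prerequisites contain a cycle")
    return order
```

With no prerequisites, the result is simply the cards in creation order, which matches the behavior described above.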

langfield commented 2 years ago

Well, a graph is usually just represented as an adjacency list, and in most languages it makes more sense for it to be a mapping. And your tag dictionary may not be necessary. You can just use a thin, typed wrapper around col.find_notes(). Also, it's probably better to use a List rather than a queue for your card queue, since you might need inserts. Not sure I understand the algorithm well enough to be sure about that. I'll have to think about it some more.
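As a small illustration of the adjacency-list-as-mapping idea, the sketch below stores the graph as a mapping from each note to the set of its prerequisites and derives the reverse mapping from it. The aliases `NoteId` and `Graph` and the function `invert` are names invented for this example.

```python
from typing import Dict, FrozenSet, Mapping, Set

NoteId = int
Graph = Mapping[NoteId, FrozenSet[NoteId]]


def invert(requires: Graph) -> Graph:
    """Turn a note -> prerequisites mapping into a note -> dependents mapping."""
    dependents: Dict[NoteId, Set[NoteId]] = {n: set() for n in requires}
    for note, prereqs in requires.items():
        for p in prereqs:
            # setdefault covers prerequisites that are not keys of `requires`.
            dependents.setdefault(p, set()).add(note)
    return {n: frozenset(ds) for n, ds in dependents.items()}
```

Keeping both directions around makes it cheap to answer "what does this note need?" and "what does this note unlock?".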

So you might have the following for type signatures:

Here we define type aliases Guid = str and CardId = int.

pwintz commented 1 year ago

My previous algorithm was too confusing, so I've come up with one that should be easier to implement:

I think that the iteration in this algorithm will run in O(mn) time for m cards with a maximum of n prerequisites per card. Assuming each card has only a few prerequisites, this will scale well.

langfield commented 1 year ago

Let $i$ denote what you're calling requirement_graph[i].key. We find each $k$ such that $i$ is a prerequisite of $k$. We remove $i$ from the prerequisites of $k$. We then recursively call this procedure on each $k$.

Note that the dependencies (your enablement_graph) do not change at all during this process. Is that right?

langfield commented 1 year ago

I think this basically does what you described.

"""PoC prerequisite sorting."""
from typing import List, FrozenSet, Mapping, Tuple, Iterable
from functools import reduce, partial
from dataclasses import dataclass

from immutables import Map

Cid = int

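# `order=True` supplies a placeholder field-wise ordering so `k < i` and `sorted()` work.
@dataclass(frozen=True, eq=True, order=True)
class Card:
    """An orderable card type."""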
    cid: Cid
    ord: int
    due: int

Queue = List[Card]
CardMap = Mapping[Card, FrozenSet[Card]]

def prune(
    deps: CardMap,
    state: Tuple[CardMap, Queue],
    i: Card,
) -> Tuple[CardMap, Queue]:
    """Prune dependencies of `i`."""
    reqs, q = state
    # Skip `i` while it still has unmet prerequisites, keeping the queue intact.
    if len(reqs[i]) > 0:
        return reqs, q

    # Remove `i` from the prerequisites of each dependency `k` and recurse on `k`.
    reqs = reduce(lambda reqs, k: reqs.delete(k).set(k, reqs[k] - {i}), deps[i], reqs)
    prunables: Iterable[Card] = filter(lambda k: k < i, deps[i])
    return reduce(partial(prune, deps), prunables, (reqs, q + [i]))

def main() -> None:
    """Run the program."""
    deps, reqs, cards = Map(), Map(), []
    # Start from an empty queue; `prune` appends cards as they become free.
    _, q = reduce(partial(prune, deps), sorted(cards), (reqs, []))

The time complexity is possibly quite bad, because I've chosen to use an immutable mapping type and frozen sets. I don't think that matters too much, though. Note that the key function for the sorted() call is not yet implemented, but is trivial given the fields we have access to in Card.

pwintz commented 1 year ago

Thanks, that helped me get started! I've pushed my attempt at it here.

langfield commented 1 year ago

Cool stuff, let me know how it works!

In the meantime, big changes are in the works for ki. Check out https://github.com/langfield/ki/issues/39 if you're curious!

pwintz commented 1 year ago

Looks like my algorithm works! It sorts the cards and is quite fast unless there are a lot of prerequisites for each card (i.e., the graph is, in a sense, dense). It does run into a RecursionError if the length of a chain of prerequisites is near the Python recursion limit (1000, by default). Maybe this will be a problem someday, but I don't think we need to worry about it for now. For relatively sparse prerequisite graphs, the algorithm can sort a deck of 20,000 cards in a few seconds and a deck of 100,000 cards in a few minutes.
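If the recursion limit ever does become a problem, one possible workaround is to replace the recursive calls with an explicit stack. The sketch below is not the code from the commit; `sort_iteratively` and its arguments are hypothetical, `cards` is assumed to already be in the secondary order (for example, the new card order), and every prerequisite is assumed to be in `cards`.

```python
from typing import Dict, List, Set


def sort_iteratively(cards: List[int], prereqs: Dict[int, Set[int]]) -> List[int]:
    """Prerequisite-respecting sort that uses an explicit stack, not recursion."""
    remaining = {c: set(prereqs.get(c, ())) for c in cards}
    dependents: Dict[int, List[int]] = {c: [] for c in cards}
    for c, ps in remaining.items():
        for p in ps:
            dependents[p].append(c)
    rank = {c: i for i, c in enumerate(cards)}

    placed: Set[int] = set()
    order: List[int] = []
    for start in cards:
        if start in placed or remaining[start]:
            continue  # already placed, or still blocked
        stack = [start]
        while stack:
            card = stack.pop()
            placed.add(card)
            order.append(card)
            unblocked = []
            for dep in dependents[card]:
                remaining[dep].discard(card)
                # Only revisit cards the outer loop has already skipped;
                # later cards are picked up when the outer loop reaches them.
                if not remaining[dep] and rank[dep] < rank[start]:
                    unblocked.append(dep)
            # Push in reverse so the lowest-ranked card is popped first.
            stack.extend(sorted(unblocked, key=rank.__getitem__, reverse=True))
    return order
```

Because the work list lives on the heap rather than the call stack, a prerequisite chain several thousand cards long does not raise a RecursionError.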

There's a suite of tests in tests/test_CardSorter.py to check that it's correct.

You can see the commit here.