pwintz / beyondki

Prerequisite sorting for Anki decks

Write a Wrapper for Anki Library to Get New Card Order #4

Open pwintz opened 1 year ago

pwintz commented 1 year ago

We need to get the new card order from the anki library to use for sorting. Rather than accessing anki directly from the sorting code, we want a thin wrapper that provides the new card order. This can probably just be a single function that reads the database, although it might be more efficient to read in large batches.

langfield commented 1 year ago

I've written some comments/suggestions below, feel free to ignore them.

I highly recommend you think of the new card ordering as a function rather than a data structure. The anki library has a Card type that has all the information you need for sorting. There's no need to put the Cids in a special list and call it your 'ordering'.

As for a wrapper, you almost certainly don't need one. The functions provided by anki are very high-level and give you everything you need. You'll just be adding boilerplate for no reason, slowing down the program by adding to the call stack, and adding more places for bugs to hide. As for efficiency, again, using anki means you'll be calling backend functions, which are compiled Rust and very fast. Pulling it all into memory in a pure-python wrapper may slow things down drastically.

I've taken a look at your sorting code. There are of course many ways to do this, but I question whether you need a python class for this. You will eliminate several classes of possible bugs if you write pure functions (no classes, no self) without any loops. This problem is perfectly amenable to this style of programming: you have an input data structure and you're sorting it according to some heuristic. This is a pure function.

I think the lack of reusability comes in object-oriented languages, not functional languages. Because the problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.

If you have referentially transparent code, if you have pure functions — all the data comes in its input arguments and everything goes out and leave no state behind — it’s incredibly reusable. -Joe Armstrong on OOP
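A minimal sketch of the pure-function style described above. The `NewCard` record and its fields are hypothetical stand-ins (not the anki library's `Card` type), and the sort key of position then card id is only an assumed heuristic:

```python
from typing import Iterable, NamedTuple


class NewCard(NamedTuple):
    """Hypothetical immutable stand-in for a card; not anki's Card type."""
    cid: int       # card id
    position: int  # assumed new-card position


def sort_new_cards(cards: Iterable[NewCard]) -> list[NewCard]:
    """Return a new list ordered by position, then card id; the input is untouched."""
    return sorted(cards, key=lambda card: (card.position, card.cid))


cards = [NewCard(cid=30, position=2), NewCard(cid=10, position=2), NewCard(cid=20, position=1)]
ordered = sort_new_cards(cards)
```

All data arrives through the argument and leaves through the return value, so the function can be tested without a collection or any shared state.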

pwintz commented 1 year ago

I highly recommend you think of the new card ordering as a function rather than a data structure. The anki library has a Card type that has all the information you need for sorting. There's no need to put the Cids in a special list and call it your 'ordering'. As for a wrapper, you almost certainly don't need one. The functions provided by anki are very high-level and give you everything you need.

Just to be clear, what I meant by "wrapper" is simply a function that maps each cid to a number used to sort the cards. I imagine it would be something like:

def new_card_position(cid: int) -> int:
    """Return the number used to sort the card with this cid."""
    raise NotImplementedError  # TODO: read the new card position from the database

Using a wrapper gives us the freedom to use different functions in tests that don't read from the database, plus we can write our code to interact with the interface that we design instead of being forced to conform to the anki package's API.
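One way the test-substitution idea might look, as a sketch only: the sorting function takes the position lookup as a parameter, so a test can pass a dictionary lookup in place of a database read. The names `sort_by_position` and `fake_positions` are illustrative and not part of the project:

```python
from typing import Callable, Iterable


def sort_by_position(cids: Iterable[int], position_of: Callable[[int], int]) -> list[int]:
    """Order card ids by whatever position lookup the caller supplies."""
    return sorted(cids, key=position_of)


# In a test, a dictionary lookup stands in for the database-backed function.
fake_positions = {101: 3, 102: 1, 103: 2}
result = sort_by_position([101, 102, 103], fake_positions.__getitem__)
```

In production the same function would receive the database-backed `new_card_position` instead of the dictionary lookup.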

As for efficiency, again, using anki means you'll be calling backend functions, which are compiled Rust and very fast. Pulling it all into memory in a pure-python wrapper may slow things down drastically.

If it turns out that reading the sort order is slow, then I'll look into speeding it up, but it's better to start out designing the cleanest code and not worry about premature optimization.

I've taken a look at your sorting code. There are of course many ways to do this, but I question whether you need a python class for this.

I was on the fence about making it a class vs. a function, so I might switch. I'll have to give it more thought.

langfield commented 1 year ago

I don't see the need for your sorting logic to interact with the database at all. No matter what type you use to represent cards (we will just call it Card for the sake of argument here), you need only define a function __le__(self, c: Card) -> bool to get a total order. If for whatever reason your Card type carries around state with it, or a reference to the database, you can simply mock it out for tests. I believe this satisfies both of the constraints you gave: (1) no DB ops during tests, and (2) other logic doesn't depend on the anki API. It has the advantage of being extremely fast compared to something like new_card_position(), which reads from the database and presumably interacts with some container that holds references to all the cards in the collection, and so may be $O(n)$.
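A sketch of the comparison-based approach, with one caveat: Python's `sorted` compares elements with `<`, so `__le__` on its own is not enough, and `functools.total_ordering` is used here to derive the remaining comparisons. The `Card` dataclass below is a hypothetical record, not anki's `Card` class, and the position-then-id ordering is an assumption:

```python
from dataclasses import dataclass
from functools import total_ordering


@total_ordering
@dataclass(frozen=True)
class Card:
    """Hypothetical card record; not anki's Card class."""
    cid: int
    position: int

    def __le__(self, other: "Card") -> bool:
        # Compare by assumed position first, then by card id to break ties.
        return (self.position, self.cid) <= (other.position, other.cid)


deck = [Card(cid=3, position=5), Card(cid=1, position=5), Card(cid=2, position=0)]
ordered = sorted(deck)
```

Because the record holds plain values and no database handle, nothing needs to be mocked to test the ordering.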

pwintz commented 1 year ago

@langfield, I've been looking into how to get the new card positions. Is this something you know how to do?

pwintz commented 1 year ago

I thought that maybe it would be Card.ord or Card.due but the Card class lacks comments, so it's hard to be sure. Maybe CardQueue would have what I need?

langfield commented 1 year ago

Refer to this (slightly outdated) description of the database schema. I think you might just want Card.id, since that encodes the creation time.

There's also some good stuff in the manual.

Reposition

Change the order new cards will appear in. You can find out the existing positions by enabling the due column, as described in the table section above. If you run the reposition command when multiple cards are selected, it will apply increasing numbers to each card in turn. By default the number increases by one for each card, but this can be adjusted by changing the "step" setting. The Shift position of existing cards option allows you to insert cards between currently existing ones, pushing the currently existing ones apart. For instance, if you have five cards and you want to move 3, 4, and 5 between 1 and 2, selecting this setting would cause the cards to end up in the order 1, 3, 4, 5, 2. By contrast, if you turn this option off, 1 and 2 will get the same position number (and it will thus be unpredictable which of the cards with the same number comes up first). Please note that when enabled, any card with a higher position will be modified, and all of those changed cards will need to be sent the next time you sync.
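The reposition behaviour described above can be approximated with a toy model. This is a sketch under assumptions, not Anki's implementation: the function name `reposition`, its parameters, and the rule of shifting unselected cards by `step * len(selected)` are all illustrative choices that happen to reproduce the manual's five-card example:

```python
def reposition(positions, selected, start, step=1, shift=False):
    """Return a new {card: position} mapping; a toy model, not Anki's code."""
    updated = dict(positions)
    if shift:
        # Push every unselected card at or beyond `start` past the inserted block.
        offset = step * len(selected)
        for card, pos in positions.items():
            if card not in selected and pos >= start:
                updated[card] = pos + offset
    # Hand out increasing numbers to the selected cards, in order.
    for i, card in enumerate(selected):
        updated[card] = start + i * step
    return updated


before = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
after = reposition(before, selected=[3, 4, 5], start=2, shift=True)
order = sorted(after, key=after.get)
```

With `shift=False` in this model, a moved card can end up sharing a position number with an existing card, which is the kind of tie the manual warns about.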

Display Order

The way Anki fetches cards from the decks depends on the algorithm used:

  • With the v1 scheduler, when a deck has subdecks, the cards will appear from each deck in turn.

  • With the v2 scheduler, when a deck has subdecks, reviews are taken from all children decks at once. The review limit of the child decks is ignored - only the limit of the deck you clicked on applies.

  • With the v3 scheduler each child deck's limit is also enforced, and you do not need to see the cards in deck order either. See the deck options section of the manual for more information.

By default, for new cards, Anki fetches cards from the decks in alphabetical order. So in the above example, you would get cards first from “French”, then “My Textbook”, and finally “Vocab”. You can use this to control the order cards appear in, placing high priority cards in decks that appear higher in the list. When computers sort text alphabetically, the “-” character comes before alphabetical characters, and “~” comes after them. So you could call the deck “-Vocab” to make them appear first, and you could call the other deck “~My Textbook” to force it to appear after everything else.
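The prefix trick can be checked with a plain code-point sort. This is only an illustration using Python's built-in `sorted`; Anki's own deck ordering may treat case or other details differently:

```python
# "-" sorts before letters and "~" sorts after them in code-point order.
deck_names = ["Vocab", "~My Textbook", "French", "-Vocab", "My Textbook"]
ordered = sorted(deck_names)
```

Here the "-Vocab" deck comes first and "~My Textbook" comes last, with the unprefixed names in between.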

New cards and reviews are fetched separately, and Anki won’t wait until both queues are empty before moving on to the next deck, so it’s possible you’ll be exposed to new cards from one deck while seeing reviews from another deck, or vice versa. If you don’t want this, click directly on the deck you want to study instead of one of the parent decks.

Since cards in learning are somewhat time-critical, they are fetched from all decks at once and shown in the order they are due.

To control the order reviews from a given deck appear in, or change new cards from ordered to random order, please see the deck options. For more fine-grained ordering of new cards, you can change the order in the browser.

So it does look like the due column is significant, but it just stores the nid, which is probably the same as the cid for most cards, since the cards are created at the same time as the notes.
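For illustration, reading new cards in due order could look roughly like the following. This runs against a mock in-memory table with only a few columns, not a real collection file, and it assumes (per the schema description) that `type = 0` marks new cards. The function name `new_card_ids_in_order` is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified subset of the columns; a real collection has many more.
conn.execute("CREATE TABLE cards (id INTEGER PRIMARY KEY, nid INTEGER, type INTEGER, due INTEGER)")
conn.executemany(
    "INSERT INTO cards (id, nid, type, due) VALUES (?, ?, ?, ?)",
    [
        (1700000000002, 1700000000000, 0, 2),    # new card
        (1700000000001, 1700000000000, 0, 1),    # new card
        (1600000000000, 1600000000000, 2, 350),  # review card; due means something else here
    ],
)


def new_card_ids_in_order(connection: sqlite3.Connection) -> list[int]:
    """Return ids of new cards (assumed type = 0), ordered by due, then id."""
    rows = connection.execute("SELECT id FROM cards WHERE type = 0 ORDER BY due, id")
    return [row[0] for row in rows]


ids = new_card_ids_in_order(conn)
```

Ordering by `due` and then `id` keeps the result deterministic when two new cards share the same due value.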

pwintz commented 1 year ago

Thanks! That was helpful. I have a working proof-of-concept now in 99f5032 (see demo.py). I need to clean it up and add tests, but it looks like it sorts the cards according to the tags and creation date, as desired.