kerrickstaley / genanki

A Python 3 library for generating Anki decks
MIT License
2.06k stars, 161 forks

No way to load/modify existing packages #66

Closed AlexApps99 closed 3 years ago

AlexApps99 commented 3 years ago

I have grown dissatisfied with a shared deck I am using, and want to make some changes to it programmatically. I want to do this while maintaining the reviews/data/progress, etc.

I assume I could make a new deck and overwrite the cards if I keep the GUIDs the same, but there are over 2000 cards in this deck so it would be impractical to read these GUIDs manually.

If it's too much work to add read functionality to this library, I would appreciate some advice on how to extract an ordered list of GUIDs from within Anki, so I can use genanki without reading the old set of cards.

Thanks

remiberthoz commented 3 years ago

If you know some SQL, you can open the apkg file as a zip. Inside you will find a sqlite database with all the data for this deck. The database is not intuitive to navigate, but that would get you started quickly.

AlexApps99 commented 3 years ago

I know a tiny bit about SQL, but have never used SQL from Python. Please can you provide me with a relevant section of code within Anki, or some docs/example on how to open and query the apkg? Thanks

Also, if this library can save as an SQL database, there's no reason for it not to go in the other direction, so this is a feature that could be worth implementing.

remiberthoz commented 3 years ago

Sure!

You will need python's sqlite3 module. I suggest that you manually extract the database from the apkg, for simplicity when getting started (maybe you can automate it later).

Rename your .apkg as a .zip file, and extract it somewhere. Find a file called collection.anki2, rename it to database.sqlite, and place it somewhere easy to access from your python code.
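The manual rename-and-extract step can also be scripted, since Python's zipfile module reads the .apkg directly. This is a minimal sketch, not part of genanki: the function name `extract_collection` is hypothetical, and it assumes the package contains a member called collection.anki2 as described above.

```python
import zipfile
from pathlib import Path


def extract_collection(apkg_path, out_dir, member="collection.anki2"):
    """Extract the SQLite collection from an .apkg (a zip archive).

    `member` is assumed to be the database file name described above;
    a package laid out differently would need a different name.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(apkg_path) as archive:
        if member not in archive.namelist():
            raise FileNotFoundError(f"{member} not found in {apkg_path}")
        # extract() returns the path of the file it wrote
        extracted = archive.extract(member, path=out_dir)
    return Path(extracted)
```

The returned path can be passed straight to sqlite3.connect, so renaming the file to database.sqlite is optional.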

Then in python:

# Load the sqlite3 module and create a cursor to the database
import sqlite3
conn = sqlite3.connect('database.sqlite')
c = conn.cursor()

You now have access to the database from this python cursor (c).

The collection database is organised in several tables. The one you are interested in is named notes. It contains multiple columns; of interest to you are id (note ID), guid (note GUID), flds (note fields, i.e. the displayed text), and sfld (note sorting field, i.e. the value by which notes are sorted in Anki).
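To see which tables and columns a given collection actually contains, SQLite's own metadata can be queried. The sketch below is only an illustration: the helper names `list_tables` and `list_columns` are hypothetical, and it relies on the standard sqlite_master table and the PRAGMA table_info statement.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def list_columns(conn, table):
    """Return the column names of `table` in declaration order."""
    # PRAGMA statements do not accept bound parameters,
    # so only pass table names you trust.
    rows = conn.execute(f"PRAGMA table_info({table})")
    # Each row is (cid, name, type, notnull, dflt_value, pk)
    return [row[1] for row in rows]
```

For example, `list_columns(conn, "notes")` on the `conn` opened earlier should list id, guid, flds and sfld among the columns.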

You can query the table for these columns with:

# Query columns of interest and print each row
# You can query all columns using `*` in place of column names
for row in c.execute('SELECT id, guid, flds, sfld FROM notes ORDER BY sfld'):
    # Each `row` is a tuple with one entry per column queried
    print(row)
    note_id = row[0]  # named note_id to avoid shadowing the built-in id()
    guid = row[1]
    flds = row[2]
    sfld = row[3]
    # (A SELECT only reads; it does not modify the database)

You can't modify the database directly with this SELECT query. To do so you would need to UPDATE. But I think the cleanest thing would be to extract the data of interest into a file (e.g. CSV, JSON, YAML) and then use another script with genanki to create a new deck.
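A rough sketch of that export step, writing JSON so that each note's field list stays structured. The function name `export_notes` and the JSON keys are hypothetical. It also assumes that the flds column joins a note's fields with the 0x1f (unit separator) character, which is worth checking against your own collection.

```python
import json
import sqlite3

# Assumption: Anki stores all fields of a note in `flds`,
# joined by the 0x1f unit-separator character.
FIELD_SEPARATOR = "\x1f"


def export_notes(db_path, json_path):
    """Dump id, guid, sort field and split fields of every note to a JSON file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT id, guid, flds, sfld FROM notes ORDER BY sfld"
        ).fetchall()
    finally:
        conn.close()

    notes = [
        {
            "id": note_id,
            "guid": guid,
            "sort_field": sfld,
            "fields": flds.split(FIELD_SEPARATOR),
        }
        for note_id, guid, flds, sfld in rows
    ]
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(notes, handle, ensure_ascii=False, indent=2)
    return notes
```

A second script could then read this file and hand each stored guid to genanki when building the replacement notes, which is the approach suggested in the original question.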

More on the SQLite3 python module: https://docs.python.org/3/library/sqlite3.html.

Now about implementing a read feature within genanki, that would be for @kerrickstaley to decide. It's a convenient feature but I'm not sure it's the purpose of this tool.

AlexApps99 commented 3 years ago

I managed to achieve my goal, so thanks for the help. I think I'll keep this issue open until Kerrick responds, as there is definitely merit in creating this feature

kerrickstaley commented 3 years ago

Thanks @remiberthoz for the super-detailed help!

The ability to read existing .apkg files is not on the roadmap for genanki in the near future. I think it can be done, but would be a decent amount of work and I would worry about issues like certain fields getting corrupted / mishandled in the translation from .apkg to Python object back to .apkg (for example, there are already certain LaTeX-related fields that genanki does not handle but instead just hardcodes defaults; we would need to fix support for these fields first if we want to correctly propagate them out of an .apkg that sets them).

I do agree that this feature could be worth doing. If someone wants to implement it I will work with them to get it merged, or if I have a windfall of free time I may implement it myself. It's not a huge task but needs to be properly implemented, tested, documented, and have corner cases worked out.

Closing this issue for now to keep the tracker tidy but will re-open if someone is actively working on it.