Closed AlexApps99 closed 3 years ago
If you know some SQL, you can open the apkg file as a zip. Inside you will find a sqlite database with all the data for this deck. The database is not intuitive to navigate, but that would get you started quickly.
I know a tiny bit about SQL, but have never used SQL from Python. Please can you provide me with a relevant section of code within Anki, or some docs/example on how to open and query the apkg? Thanks
Also, if this library can save as an SQL database, there's no reason for it not to go in the other direction, so this is a feature that could be worth implementing.
Sure!
You will need Python's sqlite3 module. I suggest that you manually extract the database from the apkg, for simplicity when getting started (maybe you can automate it later).
Rename your .apkg as a .zip file, and extract it somewhere. Find a file called collection.anki2, rename it to database.sqlite, and place it somewhere easy to access from your Python code.
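If you later want to automate the manual steps above, here is a minimal sketch using Python's standard zipfile module. The function name extract_collection and its default arguments are made up for illustration. It assumes the archive contains a member named collection.anki2, as described above; exports from other Anki versions may store the collection under a different name, in which case this will raise an error rather than guess.

```python
import zipfile
from pathlib import Path


def extract_collection(apkg_path, out_path="database.sqlite", member="collection.anki2"):
    """Copy the SQLite collection out of an .apkg archive and return its path.

    `extract_collection` is a hypothetical helper, not part of Anki or genanki.
    """
    # An .apkg is a zip archive, so no renaming is needed to open it
    with zipfile.ZipFile(apkg_path) as archive:
        # Fail early with a readable message if the expected member is absent
        if member not in archive.namelist():
            raise FileNotFoundError(f"{member} not found in {apkg_path}")
        data = archive.read(member)
    out = Path(out_path)
    out.write_bytes(data)
    return out
```

Calling extract_collection('my_deck.apkg') (with your own file name) would then leave a database.sqlite file in the current directory, ready for the steps that follow.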
Then in python:
# Load the sqlite module and create a cursor to the database
import sqlite3
conn = sqlite3.connect('database.sqlite')
c = conn.cursor()
You now have access to the database from this Python cursor (c).
The collection database is organised in several tables. The one you are interested in is named notes. It contains multiple columns; of interest to you are id (note ID), guid (note GUID), flds (note fields, i.e. the displayed text), and sfld (note sorting field, i.e. a value with which the notes are sorted in Anki).
You can query the table for these fields with:
# Query columns of interest and print each row
# You can query all columns using `*` in place of column names
for row in c.execute('SELECT id, guid, flds, sfld FROM notes ORDER BY sfld'):
    # Each `row` will be a tuple with as many entries as columns queried
    print(row)
    note_id = row[0]  # named note_id to avoid shadowing the built-in id()
    guid = row[1]
    flds = row[2]
    sfld = row[3]
    # (A SELECT query only reads rows; it does not modify the database)
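A note on flds: it holds all of a note's fields in one string. To my knowledge, Anki joins them with the ASCII unit separator character (0x1f), so a rough sketch for getting the individual fields back is shown here. The names FIELD_SEPARATOR and split_fields are invented for this example, and the separator is an assumption you should check against your own deck.

```python
# Assumption: note fields inside `flds` are joined by the ASCII unit separator
FIELD_SEPARATOR = "\x1f"


def split_fields(flds):
    """Split the raw `flds` string of one note into a list of field values."""
    return flds.split(FIELD_SEPARATOR)


# Illustrative value only, not taken from a real deck
example = "front text" + FIELD_SEPARATOR + "back text"
print(split_fields(example))
```

Inside the loop above you could call split_fields(flds) to work with each field separately.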
You can't modify the database with this SELECT query. To do so you would need an UPDATE query. But I think the cleanest thing would be to extract the data of interest into a file (e.g. CSV, JSON, YAML) and then use another script with genanki to create a new deck.
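As a sketch of the "extract into a file" step, this could dump the ordered note IDs and GUIDs to a CSV file using only the standard library. The function name export_notes and the file names are placeholders; the table and column names are the ones listed above.

```python
import csv
import sqlite3


def export_notes(db_path, csv_path):
    """Write id, guid and sfld of every note to a CSV file, ordered by sfld.

    `export_notes` is a hypothetical helper; returns the number of notes written.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT id, guid, sfld FROM notes ORDER BY sfld"
        ).fetchall()
    finally:
        # Close the connection even if the query fails
        conn.close()
    # newline="" is what the csv module expects when writing files
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "guid", "sfld"])  # header row
        writer.writerows(rows)
    return len(rows)
```

A second script can then read that CSV and pass each GUID along when building the new deck, so the two steps stay independent of each other.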
More on the SQLite3 python module: https://docs.python.org/3/library/sqlite3.html.
Now about implementing a read feature within genanki, that would be for @kerrickstaley to decide. It's a convenient feature but I'm not sure it's the purpose of this tool.
I managed to achieve my goal, so thanks for the help. I think I'll keep this issue open until Kerrick responds, as there is definitely merit in creating this feature
Thanks @remiberthoz for the super-detailed help!
The ability to read existing .apkg files is not on the roadmap for genanki in the near future. I think it can be done, but would be a decent amount of work and I would worry about issues like certain fields getting corrupted / mishandled in the translation from .apkg to Python object back to .apkg (for example, there are already certain LaTeX-related fields that genanki does not handle but instead just hardcodes defaults; we would need to fix support for these fields first if we want to correctly propagate them out of an .apkg that sets them).
I do agree that this feature could be worth doing. If someone wants to implement it I will work with them to get it merged, or if I have a windfall of free time I may implement it myself. It's not a huge task but needs to be properly implemented, tested, documented, and have corner cases worked out.
Closing this issue for now to keep the tracker tidy but will re-open if someone is actively working on it.
I have grown dissatisfied with a shared deck I am using, and want to make some changes to it programmatically. I want to do this while maintaining the reviews/data/progress etc.
I assume I could make a new deck and overwrite the cards if I keep the GUIDs the same, but there are over 2000 cards in this deck so it would be impractical to read these GUIDs manually.
If it's too much work to add read functionality to this library, I would appreciate some advice on how to extract an ordered list of GUIDs from within Anki, so I can use genanki without reading the old set of cards.
Thanks