klieret / AnkiPandas

Analyze and manipulate your Anki collection using pandas! 🌠 🐼
https://ankipandas.rtfd.io/
MIT License
133 stars 18 forks

Support opening apkg files out of the box #93

Open Blocked opened 3 years ago

Blocked commented 3 years ago
/data/data/com.termux/files/usr/lib/python3.9/site-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
   1735
   1736             ex = DatabaseError(f"Execution failed on sql '{args[0]}': {exc}")
-> 1737             raise ex from exc
   1738
   1739     @staticmethod

DatabaseError: Execution failed on sql 'SELECT * FROM cards': file is not a database

klieret commented 3 years ago

Hi @Blocked. Thanks for opening the issue :blush:.

I don't really have an answer for you right away, but this error seems to be largely unrelated to the AnkiPandas code and should already occur with the following very simple snippet:

import sqlite3
import pandas as pd

connection = sqlite3.connect("/data/data/com.termux/files/home/.venv/test.apkg")
pd.read_sql_query("SELECT * FROM cards", connection)

klieret commented 3 years ago

Perhaps as an even simpler test to see if this is also unrelated to pandas:

import sqlite3

con = sqlite3.connect("/data/data/com.termux/files/home/.venv/test.apkg")
cursor = con.cursor()
cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
print(cursor.fetchall())
cursor.close()
con.close()

This should print all tables in the file.

klieret commented 3 years ago

You could also directly try to use the command line interface of sqlite3 to enter SELECT name FROM sqlite_master WHERE type='table'; or SELECT * FROM cards and see if that works.
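A related check that needs no SQL at all is to look at the first bytes of the file. The sketch below is not from this thread; it relies only on the documented file signatures (a SQLite 3 database starts with the 16-byte string "SQLite format 3\0", and a non-empty zip archive starts with "PK\x03\x04"). The function name sniff_file_type is made up for illustration.

```python
SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of a SQLite 3 database
ZIP_MAGIC = b"PK\x03\x04"              # local file header of a non-empty zip


def sniff_file_type(path):
    """Return 'sqlite', 'zip', or 'unknown' based on the leading bytes.

    Hypothetical helper for illustration; not part of AnkiPandas.
    """
    with open(path, "rb") as handle:
        head = handle.read(16)
    if head == SQLITE_MAGIC:
        return "sqlite"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

Calling sniff_file_type on the path passed to sqlite3.connect would show whether the "file is not a database" message is simply accurate for that file.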

Blocked commented 3 years ago

@klieret Thanks 👍 Only sqlite3 from the command line seems to work. But even there, the apkg has no cards table, only a zip table:

sqlite3
sqlite> .open test.apkg
sqlite> .tables
zip
sqlite> select * from cards;
Error: no such table: cards

select * from zip does return scrambled data, which actually contains the card data. And in pandas I still get the "DatabaseError: file is not a database" error on pd.read_sql_query("SELECT * FROM cards", connection).

Any direction?

klieret commented 3 years ago

Hmm, very interesting. Where is the anki database from? Is it from AnkiDroid? Then it looks like AnkiDroid is using a different format for its database than Anki desktop.

I've honestly never considered that someone might want to use this with AnkiDroid, so I've never tested this. I'm also curious about your use case ;)

klieret commented 3 years ago

@allcontributors please add @Blocked for bug

Blocked commented 3 years ago

I don't think there's any db structure difference. The apkg files are from https://ankiweb.net, e.g. https://ankiweb.net/shared/info/965641886 Also, I can't find any difference in the database structure documented for AnkiDroid here: https://github.com/ankidroid/Anki-Android/wiki/Database-Structure

My intention is simply to manipulate cards on droid itself.

allcontributors[bot] commented 3 years ago

@klieret

I've put up a pull request to add @Blocked! :tada:

klieret commented 3 years ago

Okay, I see this too now (using this tiny deck).

So your use case then is to manipulate the cards before (!) you import them to your profile? Because what this tells me right now is that the format of shared decks seems to be different from that of the profile of a person (which makes sense, right?)

klieret commented 3 years ago

OK, but it seems like this is actually just a zip file. So if you unzip the file, you will get pictures, a human-readable file named media, and a database named collection.anki2, which seems to be something that AnkiPandas should be able to read.
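The unzip step described above can be sketched with the standard library's zipfile module. This is a minimal sketch, not AnkiPandas code: the helper name extract_collection is hypothetical, and the member name collection.anki2 is taken from this comment and may differ for other archives.

```python
import zipfile
from pathlib import Path


def extract_collection(apkg_path, out_dir, member="collection.anki2"):
    """Extract one database file from an .apkg (zip) archive.

    Hypothetical helper for illustration. Returns the path of the
    extracted file inside out_dir.
    """
    if not zipfile.is_zipfile(apkg_path):
        raise ValueError(f"{apkg_path} is not a zip archive")
    with zipfile.ZipFile(apkg_path) as archive:
        names = archive.namelist()
        if member not in names:
            raise KeyError(f"{member} not found; archive contains {names}")
        # extract() creates out_dir if needed and returns the written path
        return Path(archive.extract(member, path=out_dir))
```

The returned path could then be handed to AnkiPandas' Collection in place of the .apkg file itself.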

klieret commented 3 years ago

I've actually just tested with the deck and it all works fine :)

from ankipandas import Collection

c = Collection("collection.anki2")
c.notes
c.cards
c.revs

Blocked commented 3 years ago

I don't think there's a difference between a shared deck and an exported deck. I checked an exported deck just now and I get the same zip table.

klieret commented 3 years ago

Interesting. So usually AnkiPandas works on the collection directly in the config directory of Anki (so not exported). I'll add a note about this to the documentation.

Blocked commented 3 years ago

@klieret Thank you!! It works after extracting collection.anki2

Blocked commented 3 years ago

If you think it'll be useful, consider adding direct support for shared deck apkg files as well (testing, unzipping, extracting, modifying, and rezipping).

Thanks again 👍

klieret commented 3 years ago

I will definitely add example code in the next few days. zipfile from the standard library provides good support for opening files from a zip archive and writing them back, so maybe there is a relatively elegant way to handle this.
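As a rough idea of what such a zipfile round trip could look like, here is a minimal sketch. It is an assumption about one possible approach, not the code that ended up in AnkiPandas: the name rewrite_apkg and the modify callback are hypothetical, and the default member name collection.anki2 is taken from earlier in this thread.

```python
import tempfile
import zipfile
from pathlib import Path


def rewrite_apkg(src, dst, modify, member="collection.anki2"):
    """Write a copy of the archive src to dst with one member edited.

    modify is a callable that receives the path of the extracted
    member and edits that file in place. Hypothetical helper.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Unpack everything so media files are carried over unchanged.
        with zipfile.ZipFile(src) as archive:
            names = archive.namelist()
            archive.extractall(tmp)
        modify(Path(tmp) / member)
        # Re-create the archive with the original member names.
        with zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as archive:
            for name in names:
                archive.write(Path(tmp) / name, arcname=name)
```

Unpacking to a temporary directory keeps the original archive untouched until the new one has been written, at the cost of copying every media file once.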

I am still not sure if I can add read and especially write support for .apkg files directly to AnkiPandas without complicating things too much, but let's see :)

rpryzant commented 1 year ago

Love this thread as I was experiencing the same issue! Was this ever added to the docs?

e.g. I can't find it here https://ankipandas.readthedocs.io/en/latest/troubleshooting.html

Would be super useful to include native "exported deck" support

klieret commented 1 year ago

Thanks for the ping. Yes, indeed this wasn't added yet. I hope I can find some time this week!

klieret commented 1 year ago

Hi @rpryzant @Blocked I've implemented a first version of this in #139 (actually adding both read and write support). Do you want to test it? I still have to add some unit tests, but if you want you can already check out the branch and install from there for beta testing.

rpryzant commented 1 year ago

Thanks so much @klieret !

Hmm, this isn't working for me. My Anki deck (attached here: https://www.dropbox.com/s/l4hqbuckxn86le7/anki-raw-3-23-23.apkg?dl=0) only worked when I extracted it first...

Code:

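from ankipandas import Collection

# Works: database file extracted from the archive by hand
col = Collection('anki-raw-3-23-23/collection.anki21', user="User 1")
print(col.cards)

# Does not work as expected: reading the .apkg directly
col = Collection('anki-raw-3-23-23.apkg', user="User 1")
print(col.cards)
quit()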

Output:

[image: screenshot of the output]

rpryzant commented 1 year ago

The collection from the extracted deck has cards, but the collection from the raw deck only has one broken card. I hope that's helpful!

klieret commented 1 year ago

Thanks for the feedback, I hope I get to test the code against the example you submitted next week

klieret commented 1 year ago

Hi @rpryzant. Sorry for the late reply. I looked at your file, and to me it simply seems to be corrupted, because I see the following output:

nid
1679532946726  M:Jo^S<Q/$  1679532946    -1    []  [This file requires a newer version of Anki., ]  Basic

I do not see any issue when I use apkgs that I find on the internet.
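The report above reads collection.anki21 from the extracted folder, while earlier comments in this thread mention collection.anki2. One possibility, stated here as an assumption rather than a diagnosis, is that the archive contains both files and the placeholder note comes from the older one. A minimal sketch for checking which database files an archive holds and picking one; the helper name and the preference order are hypothetical, not AnkiPandas behaviour:

```python
import zipfile

# Assumed preference order, for illustration only: newer name first.
PREFERRED_MEMBERS = ("collection.anki21", "collection.anki2")


def pick_collection_member(apkg_path):
    """Return the first preferred database name found in the archive.

    Hypothetical helper; raises KeyError if none of the names is present.
    """
    with zipfile.ZipFile(apkg_path) as archive:
        names = set(archive.namelist())
    for candidate in PREFERRED_MEMBERS:
        if candidate in names:
            return candidate
    raise KeyError(f"no known collection file among {sorted(names)}")
```

If this returned collection.anki21 for the attached deck, extracting that member and opening it would be a quick way to test the assumption.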