kerrickstaley / genanki

A Python 3 library for generating Anki decks
MIT License
2.06k stars 161 forks source link

Get pkg as bytes #54

Closed dallonasnes closed 3 years ago

dallonasnes commented 4 years ago

Added method to get package as bytes object. I used this to build a package on a server and send to client over HTTP for download. Includes unit test that verifies bytes package is still a valid anki package.

kerrickstaley commented 4 years ago

Thanks for this PR :smile:, and special thanks for adding a test for the functionality.

I agree that this is an important use-case, but I'm on the fence about whether to merge this PR, because I'd like to keep genanki simple, which helps make it easier to maintain.

Let's look at how you could do this without this helper method:

fnode, fpath = tempfile.mkstemp()
os.close(fnode)
my_package.write_to_file(fpath)
with open(fpath, 'rb') as fhandle:
    data = fhandle.read()
os.unlink(fpath)

That's six lines of code, of medium complexity (you're probably going to have to spend some time reading the tempfile docs to get this working). One disadvantage of this code is that it writes the Zip file to disk, whereas your version keeps it in memory. One advantage is that the code path is exactly the same for on-disk vs in-memory; there is no potential for divergence. A neutral thing is that both implementations use temporary files; currently, it's not possible, using the Python API, to get a SQLite database's disk representation without writing it to disk.

I think if I were to merge this, I would re-write it to use the above implementation, because I don't think the performance advantages to keeping things in-memory are that significant (but would love to be proven wrong about this). So then the question is whether to add a helper method that doesn't use any internal APIs (so the user could write it themselves). I will sleep on it and get back to you.

If issue 41930 is addressed, then we could create a get_pkg_as_bytes implementation that does not use temporary files at all, which would be a little more compelling, because it seems there are some Python implementations (rare, but they exist) that do not have proper tempfile support.

dallonasnes commented 4 years ago

thanks for taking a look and for your analysis. i see how your snippet avoids a divergent code path. And I agree the performance difference between the in-memory vs disk version isn't that important. In fact, I wasn't thinking about performance so much as just wanting to avoid writing and cleaning up .apkg files when I had made this change. And I hope to see that issue 41930 get addressed.

kerrickstaley commented 3 years ago

Hey! After thinking about this, I decided that it's best for this feature to live outside the genanki library, again to keep the library as minimal as possible. Thanks again for the PR!