polm / posuto

🏣📮〠 Japanese postal code data.
MIT License

Suggestion: build and distribute as a SQLite DB file #3

Closed: simonw closed this issue 3 years ago

simonw commented 3 years ago

This library currently distributes data as a built JSON file, which has to be read into memory in full.

If the library used a binary SQLite DB file it could run lookups against an index and avoid needing to load the entire file into memory.

Since sqlite3 is part of the Python standard library, this could be done without any extra dependencies.
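
To make the suggestion concrete, here is a minimal sketch of an indexed lookup using only the standard library. The table name, column names, and sample rows are assumptions for illustration, not posuto's actual schema or data format.

```python
import sqlite3

# Hypothetical sample rows: (postal code, prefecture, city, neighborhood).
SAMPLE_ROWS = [
    ("1000001", "東京都", "千代田区", "千代田"),
    ("5300001", "大阪府", "大阪市北区", "梅田"),
]


def build_db(path=":memory:"):
    """Create a small postal code table; the schema here is an assumption."""
    conn = sqlite3.connect(path)
    # PRIMARY KEY gives SQLite an index on `code`, so a lookup does not
    # need to scan the table or load it into memory.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postal ("
        "code TEXT PRIMARY KEY, prefecture TEXT, city TEXT, neighborhood TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO postal VALUES (?, ?, ?, ?)", SAMPLE_ROWS
    )
    conn.commit()
    return conn


def lookup(conn, code):
    """Return (prefecture, city, neighborhood) for a code, or None if absent."""
    cur = conn.execute(
        "SELECT prefecture, city, neighborhood FROM postal WHERE code = ?",
        (code,),
    )
    return cur.fetchone()
```

In a packaged library the database file would be built once at release time and shipped with the package, so only `lookup` would run on the user's machine.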

polm commented 3 years ago

That's a great idea - thanks for the suggestion.

I've never packaged a library like that before; what's the right way to handle connections? Should I open a connection at import time and re-use it or make a connection for every query?
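
One possible middle ground between the two options, sketched here as an assumption and not as what posuto ended up doing, is to open the connection lazily on first use and then re-use it. By default a `sqlite3` connection may only be used from the thread that created it, so this sketch keeps one connection per thread. `_DB_PATH` and `get_connection` are hypothetical names.

```python
import sqlite3
import threading

# Hypothetical path; a real package would point at its bundled DB file.
_DB_PATH = ":memory:"

# One connection per thread, because sqlite3 connections are by default
# restricted to the thread that created them.
_local = threading.local()


def get_connection():
    """Return this thread's connection, opening it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(_DB_PATH)
        _local.conn = conn
    return conn
```

This avoids doing any work at import time while still avoiding a fresh connection for every query.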

polm commented 3 years ago

I realized one issue with this: because the JSON can be stored compressed and decompressed at runtime, it takes up only about 3MB on disk. In contrast, a vanilla SQLite file can't be compressed on disk, so it's 70MB. That's not a big deal for most applications (and it's compressed for download anyway), but it's something to keep in mind.
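
For reference, the JSON side of this trade-off looks roughly like the sketch below. It assumes gzip, which the thread does not confirm, and the function names are hypothetical. The file stays small on disk, but the whole data set is decompressed and parsed into memory on load.

```python
import gzip
import json


def write_compressed(path, data):
    """Write data as gzip-compressed JSON (the compression format is an assumption)."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False)


def load_compressed(path):
    """Decompress and parse the whole file into memory at runtime."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```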

I have working code doing this in the feature-sqlite branch.

polm commented 3 years ago

Just released this as v0.2.0.