Closed lemon24 closed 3 years ago
This has more subtleties than you'd think.
sqlite3 (the Python module) doesn't support non-naive datetimes – they get written fine, but we get a ValueError: invalid literal for int() with base 10: b'00+00'
when trying to get them out. This is a known bug; a fix was rejected because there's another issue to deprecate adapters; this being stdlib, it won't likely won't be.
We can either install the converter in the patch globally with sqlite3.register_converter
(there's no per-db-connection registry), or do our own conversion.
There are two ways to store the timezone; either have a duplicate ("computed") column for the UTC equivalent and sort by that, or always sort by datetime(<some datetime column>)
(which converts to UTC on the fly). The tradeoff is what you'd expect.
If we want to store the timezone, we have to implement the above, and we have to force all future storage implementation to do it too (supporting both versions makes things needlessly complicated).
Alternatively, we can leave all the internals use naive (assumed UTC) datetimes, and only convert them to datetimes with explicit (UTC-only) timezones in the high level API (so we're still doing the right thing for the high level / external users). This seems OK for now.
Frankly, I don't have a strong use case to store the original timezone (yeah, it would be nice, but YAGNI).
And only some feeds use a non-UTC timezone anyway; here are some counts for my database for updated or published
:
The fake parser returns naive datetimes; we then compare them with those returned by Reader. This means a lot (read: all) of tests in test_reader_*.py
will be broken by Reader returning aware datetimes.
Assuming we're going with the skin-deep timezone awareness, we'll have to find a solution that doesn't require fixing all of the tests.
It's the right thing: https://blog.ganssle.io/articles/2019/11/utcnow.html
Can only be done in 2.0.