NYPL-Simplified / server_core

Shared data model and utilities for Library Simplified server applications
7 stars 11 forks source link

UTC Datetime Update #1256

Closed EdwinGuzman closed 3 years ago

EdwinGuzman commented 3 years ago

Helps resolve SIMPLY-3663.

I originally updated the syntax for all utcnow and utcfromtimestamp functions to now and fromtimestamp, respectively with the appropriate params. I decided to go with the pytz.UTC value rather than datetime.timezone.utc because the standard library recommends using pytz. I then noticed that even though timezone-aware dates were being created and inserted into the database, that the value queried from the database lost its timezone (from test failures).

I then updated db timestamp columns to be Column(DateTime(timezone=True), nullable=False) and this worked. But here's where I wasn't sure if this needs a migration. I can't be sure if older data without a timestamp actually causes an issue when it's being read. I tried a migration script that has ALTER TABLE cachedfeeds ALTER COLUMN "timestamp" SET DATA TYPE timestamptz;, for example. This returns timezone-aware datetime objects from the database but running the script potentially crashed my app server. With new databases, running ./core/bin/initialize_database returned Illegal instruction: 4. Will need help on this next week.

EdwinGuzman commented 3 years ago

Update: I found that the Column(DateTime(timezone=True), ...) crashes the circulation and metadata wrangler flask apps but they don't give any good errors or warnings (with and without the migration update). It looks like postgres stores dates in UTC regardless of whether the python datetime object is naive or aware, so maybe the best approach right now is to manually "convert" the date from the db into an aware object through .replace(tzinfo=pytz.UTC). So far this is working and fixes broken tests, but it's a much more tedious approach and we'd have to remember this whenever we make any updates.

leonardr commented 3 years ago

What do the errors and warnings look like, even if they're not helpful? I would like to make sure that datetimes always come out of the database in a timezone-aware form.

The code is fine as far as it goes but I can tell it's going to cause problems keeping everything straight. What do you think about introducing some helper functions to at least make it easier?

now() -- shortcut for datetime.datetime.now(tz=pytz.UTC) utc(datetime) -- shortcut for datetime.replace(tzinfo=pytz.UTC)

I wish there was some way to just stop naive datetimes from coming into the system at all...

EdwinGuzman commented 3 years ago

What do the errors and warnings look like, even if they're not helpful? I would like to make sure that datetimes always come out of the database in a timezone-aware form.

There are no warnings or errors. After running python app.py and going to the localhost page, the server just stops. If I use flask run, then an Illegal instruction: 4 error appears before the server stops. I tried debugging to see what happens and it looks like there are still some naive/aware dates being compared and the server crashes rather than giving an error.

What do you think about introducing some helper functions to at least make it easier?

I was thinking about this but it's easy to not use the helper function and write the native implementation. It made more sense to me to write it out every time so I get used to the syntax rather than relying on a helper function but I can change this, it's not a big deal.

Without Column(DateTime(timezone=True)), I have to make updates like the following (since dates coming from the db are not aware types) to fix tests and it's a lot more tedious:

feed_obj, is_new = get_one_or_create(_db, cls, **kwargs)
if feed_obj.timestamp:
    feed_obj.timestamp = feed_obj.timestamp.replace(tzinfo=pytz.UTC)

So I'm going to try again with the db migration in place again and keep debugging to see where there are still naive/aware comparisons being made.

tdilauro commented 3 years ago

I think the helpers could be handy, but now() could be confusing, since someone might assume that it was imported from datetime. Care should be taken to avoid that confusion.

EdwinGuzman commented 3 years ago

Updates:

I think something is inherently wrong with my setup where the app fails to run after its initial run. I thought it had something to do with this branch, the migration script, and aware and naive dates being compared, but this issue happens on new installs on other branches.

This makes me think something else is wrong on my local machine and makes this branch hard to test. I'm going to keep digging my issue but if possible, @leonardr can you try this branch and the migration script on one of your circ managers?

leonardr commented 3 years ago

Running the migration script shows migrations for a couple of fields that I don't think are in the model:

ERROR:  column "created" does not exist
LINE 1: update credentials set created = (created at time zone 'utc'...
                                          ^
ERROR:  column "updated" does not exist
LINE 1: update credentials set updated = (updated at time zone 'utc'...

That's not pertinent to what you're asking though; I'm still investigating.

leonardr commented 3 years ago

I'm able to run the python3 circulation branch with this core branch without an issue. Let's work this out interactively.

EdwinGuzman commented 3 years ago

The migration script issue was bad copying and pasting but that's fixed now.