Closed by rebeccacremona 1 year ago
A little unexpected that the default didn't kick in: to be investigated.
Update: I believed from my initial review of the emails about this error that a problem was happening for API users. That's incorrect.
We saw fewer than a dozen integrity errors like the above, a few for each kind of user (GUI users at perma.cc/api/v1, API users at api.perma.cc authing via headers, API users at api.perma.cc authing via the querystring), all at 11:27 AM on November 18th. That was at the very tail end of a deployment where the default-to-screenshot feature was introduced:
So here's what I'm pretty sure happened.
Somehow or other, the database got migrated and the new non-nullable column was added, but a few lingering requests were processed using the old application code in the meantime. (Django supplies default values in Python, not by designating a default value for the column at the DB level, so an INSERT issued by the old code carries no value for the new column.)
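To illustrate the failure mode, here is a minimal sketch using only the standard library's `sqlite3` as a stand-in for the production database. The table name `link`, the column `capture_mode`, and the value `"screenshot"` are all hypothetical (the real schema isn't shown in this thread); the only point is that a NOT NULL column with no DB-level default rejects an INSERT from code that doesn't know the column exists.

```python
import sqlite3

# Hypothetical post-migration schema: the new column is NOT NULL and has
# no DEFAULT at the database level (names here are made up for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE link ("
    " id INTEGER PRIMARY KEY,"
    " submitted_url TEXT NOT NULL,"
    " capture_mode TEXT NOT NULL"
    ")"
)

NEW_CODE_DEFAULT = "screenshot"  # hypothetical default applied by the new code


def insert_with_old_code(conn, url):
    """Old application code: unaware of the new column, so it omits it."""
    conn.execute("INSERT INTO link (submitted_url) VALUES (?)", (url,))


def insert_with_new_code(conn, url, capture_mode=NEW_CODE_DEFAULT):
    """New application code: fills in the default in Python before the INSERT."""
    conn.execute(
        "INSERT INTO link (submitted_url, capture_mode) VALUES (?, ?)",
        (url, capture_mode),
    )


def old_code_fails(conn, url):
    """Return True if the old-code INSERT is rejected by the migrated schema."""
    try:
        insert_with_old_code(conn, url)
    except sqlite3.IntegrityError:
        return True
    return False
```

Calling `old_code_fails(conn, "https://example.com")` against this schema hits the NOT NULL constraint, while `insert_with_new_code` succeeds, which matches the theory that only requests served by the old code during the overlap could have failed.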
We enter maintenance mode as the first step of a deployment, preventing additional requests, but what happens to requests that are already in flight is a bit of a mystery to me. This suggests that uWSGI continued handling the in-flight requests normally and then, after it got bounced, picked up new requests with the new code as expected. In this case, the DB migration was fast enough that the in-flight requests talked to the migrated DB.
How about that.
It hasn't happened again since, and I can't reproduce.
Bringing to @bensteinberg's attention, in case we want to think about implications for deployment strategies, but also closing as #wontfix.