Closed by sentry-io[bot] 9 months ago
Hey @oscgonfer - I think this has to do with a migration that hasn't been run. Can you re-run rake db:migrate
in production and check whether it does anything? Also, please send me the results of this query on the production database:
select id, created_at, updated_at from devices where workflow_state='archived' and archived_at is null;
Basically, the problem arises when a device has a workflow_state of 'archived' but no archived_at date - something that shouldn't be happening!
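To make the broken invariant concrete, here is a minimal sketch in plain Ruby (not the actual ActiveRecord model - the Device struct and archive! method are illustrative): the two fields are meant to change together, and the query above flags rows where they drifted apart.

```ruby
# Minimal sketch of the invariant: a device whose workflow_state is
# 'archived' must also carry an archived_at timestamp.
Device = Struct.new(:id, :workflow_state, :archived_at) do
  def archive!(at: Time.now.utc)
    self.workflow_state = "archived"
    self.archived_at = at   # set both fields together so they can't drift
  end

  def invariant_violated?
    workflow_state == "archived" && archived_at.nil?
  end
end

good = Device.new(1, "active", nil)
good.archive!
bad = Device.new(2, "archived", nil)  # state flipped without the timestamp

puts good.invariant_violated?  # false
puts bad.invariant_violated?   # true
```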
Here is the output of doco exec app rake db:migrate:
D, [2023-09-26T14:11:06.978515 #34] DEBUG -- sentry: Sentry HTTP Transport will connect to https://o373800.ingest.sentry.io
D, [2023-09-26T14:11:06.979305 #34] DEBUG -- sentry: Initializing the background worker with 6 threads
D, [2023-09-26T14:11:08.211929 #34] DEBUG -- : (0.5ms) SELECT pg_try_advisory_lock(7014306594500969820)
D, [2023-09-26T14:11:08.233418 #34] DEBUG -- : (2.3ms) SELECT "schema_migrations"."version" FROM "schema_migrations" ORDER BY "schema_migrations"."version" ASC
D, [2023-09-26T14:11:08.245891 #34] DEBUG -- : ActiveRecord::InternalMetadata Load (1.0ms) SELECT "ar_internal_metadata".* FROM "ar_internal_metadata" WHERE "ar_internal_metadata"."key" = $1 LIMIT $2 [["key", "environment"], ["LIMIT", 1]]
D, [2023-09-26T14:11:08.256146 #34] DEBUG -- : (1.0ms) SELECT pg_advisory_unlock(7014306594500969820)
D, [2023-09-26T14:11:08.263350 #34] DEBUG -- sentry: Shutting down background worker
D, [2023-09-26T14:11:08.263597 #34] DEBUG -- sentry: Killing session flusher
And this is the result in Postgres.
Query:
select id, created_at, updated_at from devices where workflow_state='archived' and archived_at is null;
Output:
id | created_at | updated_at
-------+----------------------------+----------------------------
16518 | 2023-07-17 02:27:05.432568 | 2023-07-18 16:02:16.055239
10555 | 2019-11-27 11:22:50.083728 | 2023-07-17 16:35:13.959288
16472 | 2023-06-30 09:25:56.076917 | 2023-07-17 14:29:33.82087
16502 | 2023-07-12 17:11:12.09499 | 2023-07-17 14:28:18.103765
16515 | 2023-07-15 18:22:14.306812 | 2023-07-15 18:30:20.63617
16433 | 2023-06-16 06:14:48.49726 | 2023-07-15 17:47:29.167594
16489 | 2023-07-05 18:55:55.39562 | 2023-07-13 18:25:40.524085
16511 | 2023-07-13 16:09:34.424832 | 2023-07-13 16:18:07.183399
16509 | 2023-07-13 10:01:48.623642 | 2023-07-13 10:24:16.059672
16503 | 2023-07-12 18:48:06.203876 | 2023-07-12 22:44:53.370171
16506 | 2023-07-12 19:40:59.762902 | 2023-07-12 19:48:57.405454
16505 | 2023-07-12 19:19:05.419242 | 2023-07-12 19:38:50.094216
16496 | 2023-07-11 17:12:47.302893 | 2023-07-12 15:40:17.480394
16491 | 2023-07-07 20:31:06.888604 | 2023-07-12 15:40:16.909974
16497 | 2023-07-11 17:23:18.608334 | 2023-07-12 15:40:10.217591
16498 | 2023-07-11 19:02:18.501328 | 2023-07-12 15:40:05.240243
16488 | 2023-07-05 16:38:43.624591 | 2023-07-05 17:32:18.193923
16485 | 2023-07-04 20:27:01.163499 | 2023-07-05 16:42:08.481627
(18 rows)
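One possible data fix for these 18 rows - an assumption on my part, not something from the thread - would be to backfill the missing archived_at from updated_at, since updated_at records the moment the row last changed (presumably when it was archived). A plain-Ruby sketch of that idea:

```ruby
require "time"

# Hypothetical backfill: for rows where workflow_state is 'archived'
# but archived_at is NULL, use updated_at as a best-effort archived_at.
# Sample rows are taken from the query output above.
rows = [
  { id: 16518, workflow_state: "archived", archived_at: nil,
    updated_at: Time.parse("2023-07-18 16:02:16 UTC") },
  { id: 4859,  workflow_state: "archived",
    archived_at: Time.parse("2023-07-05 10:02:56 UTC"),
    updated_at:  Time.parse("2023-07-05 10:02:56 UTC") },
]

rows.each do |row|
  if row[:workflow_state] == "archived" && row[:archived_at].nil?
    row[:archived_at] = row[:updated_at]  # backfill only the NULLs
  end
end

rows.each { |r| puts "#{r[:id]} #{r[:archived_at].strftime('%Y-%m-%d')}" }
```

The SQL equivalent would be a single UPDATE scoped by the same WHERE clause as the diagnostic query.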
However, I see other things too. For instance, there are other archived devices that haven't been deleted yet, even though we rolled this out in the summer:
select id, created_at, archived_at from devices where workflow_state='archived';
Output:
id | created_at | archived_at
-------+----------------------------+----------------------------
16518 | 2023-07-17 02:27:05.432568 |
10555 | 2019-11-27 11:22:50.083728 |
16472 | 2023-06-30 09:25:56.076917 |
16502 | 2023-07-12 17:11:12.09499 |
16515 | 2023-07-15 18:22:14.306812 |
16433 | 2023-06-16 06:14:48.49726 |
16489 | 2023-07-05 18:55:55.39562 |
16511 | 2023-07-13 16:09:34.424832 |
16509 | 2023-07-13 10:01:48.623642 |
16503 | 2023-07-12 18:48:06.203876 |
16506 | 2023-07-12 19:40:59.762902 |
16505 | 2023-07-12 19:19:05.419242 |
16496 | 2023-07-11 17:12:47.302893 |
16491 | 2023-07-07 20:31:06.888604 |
16497 | 2023-07-11 17:23:18.608334 |
16498 | 2023-07-11 19:02:18.501328 |
16488 | 2023-07-05 16:38:43.624591 |
16485 | 2023-07-04 20:27:01.163499 |
14674 | 2021-10-25 09:48:47.14252 | 2023-07-05 10:02:56.894366
4859 | 2018-10-09 14:12:59.83813 | 2023-07-05 10:02:56.894366
9472 | 2019-05-29 09:34:46.960913 | 2023-07-05 10:02:56.894366
9474 | 2019-05-29 09:38:04.298199 | 2023-07-05 10:02:56.894366
4796 | 2018-09-20 09:13:25.208197 | 2023-07-05 10:02:56.894366
4776 | 2018-09-14 19:39:48.178585 | 2023-07-05 10:02:56.894366
4860 | 2018-10-09 14:24:06.379064 | 2023-07-05 10:02:56.894366
4845 | 2018-10-05 17:47:12.037375 | 2023-07-05 10:02:56.894366
9391 | 2019-04-11 14:47:40.338492 | 2023-07-05 10:02:56.894366
4791 | 2018-09-20 07:10:02.899696 | 2023-07-05 10:02:56.894366
4792 | 2018-09-20 07:25:41.214083 | 2023-07-05 10:02:56.894366
9387 | 2019-04-10 11:44:00.984849 | 2023-07-05 10:02:56.894366
15130 | 2022-02-28 12:55:59.640611 | 2023-07-05 10:02:56.894366
4774 | 2018-09-14 16:18:12.630088 | 2023-07-05 10:02:56.894366
14489 | 2021-07-28 10:44:29.86678 | 2023-07-05 10:02:56.894366
9459 | 2019-05-24 16:05:45.989368 | 2023-07-05 10:02:56.894366
15050 | 2022-02-15 11:08:02.737447 | 2023-07-05 10:02:56.894366
...
Aha, ok, interesting - I'll take a look. As a sensible workaround, it might be worth allowing the device archive job to delete archived devices WITHOUT an archived_at, since all the ones that exist are now old enough to have been archived anyway. I'll also check what's going on with those other ones that should've been deleted, though!
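The relaxed deletion rule could look something like this - a plain-Ruby sketch, where the deletable? predicate, the retention cutoff, and the hash-shaped rows are all illustrative stand-ins for the real ActiveRecord scope in the archive job:

```ruby
require "time"

RETENTION_DAYS = 30  # assumed cutoff, not stated in the issue

# A device is deletable if it is archived AND either its archived_at
# is missing (the broken rows above) or older than the retention window.
def deletable?(device, now: Time.now.utc)
  return false unless device[:workflow_state] == "archived"
  device[:archived_at].nil? ||
    device[:archived_at] < now - RETENTION_DAYS * 24 * 3600
end

now = Time.parse("2023-09-26 14:11:06 UTC")
puts deletable?({ workflow_state: "archived", archived_at: nil }, now: now)
puts deletable?({ workflow_state: "archived",
                  archived_at: Time.parse("2023-07-05 10:02:56 UTC") }, now: now)
puts deletable?({ workflow_state: "active", archived_at: nil }, now: now)
```

In ActiveRecord terms this would be a where clause matching archived devices with archived_at IS NULL OR archived_at older than the cutoff.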
Sentry Issue: SC-API-AC