ArpNetworking / metrics-portal


EBeanServer should treat dates it receives as UTC #419

Open cwbriones opened 4 years ago

cwbriones commented 4 years ago

This might only be a problem with the test EbeanServer, but I'm opening this regardless so it can be confirmed.

In #398 I discovered that Ebean was using the local time zone to compute the partition date for execution records, while the repository was using UTC. As a result, the repository thinks it has created the partition for a given record, but the insert still fails because the dates don't actually match due to the time zone discrepancy.

I was able to fix this by manually configuring the server instance used by the integration tests, but this may still be an issue in production. I'm also concerned that the switch will break existing repositories keyed by date, since a key that previously mapped to local time will map to UTC after the configuration change. I believe the only existing repository affected is ReportExecutionRepository, but I haven't confirmed this yet.
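For the record, the test fix amounts to pinning the zone the server uses for timestamps. A minimal sketch of the idea (the class name here is made up, and it assumes the Ebean version in use exposes the `dataTimeZone` setting on `ServerConfig`):

```java
import java.util.TimeZone;

import io.ebean.EbeanServer;
import io.ebean.EbeanServerFactory;
import io.ebean.config.ServerConfig;

// Sketch only: the actual integration-test wiring differs; names here are illustrative.
public final class UtcEbeanServer {
    public static EbeanServer create() {
        // Option 1 (blunt): run the whole JVM in UTC so JDBC timestamp binding
        // can no longer disagree with the application's Instant arithmetic.
        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));

        ServerConfig config = new ServerConfig();
        config.setName("default");
        config.loadFromProperties();
        // Option 2 (scoped, assuming this setting is available): tell Ebean's
        // JDBC layer to read/write timestamps as UTC regardless of the JVM zone.
        config.setDataTimeZone("UTC");
        return EbeanServerFactory.create(config);
    }
}
```

Either approach would also change how existing date-keyed rows are read back, which is the migration concern above.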

cwbriones commented 4 years ago

I've confirmed that the only models with timestamp-based keys are the execution classes: ReportExecution and AlertExecution.

cwbriones commented 4 years ago

Log snippet from the failing tests:

2020-06-22 17:53:03,557 default-akka.actor.default-dispatcher-4 [INFO] com.arpnetworking.metrics.portal.alerts.impl.DailyPartitionCreator : name="log", line="257", file="DailyPartitionCreator.java", class="class com.arpnetworking.metrics.portal.alerts.impl.DailyPartitionCreator", message="Creating daily partitions for table", schema="portal", table="alert_executions", startDate="2020-06-23", endDate="2020-06-28"
2020-06-22 17:53:03,557 default-akka.actor.default-dispatcher-4 [DEBUG] io.ebean.SQL : txn[1003] select * from create_daily_partition(?::text, ?::text, ?::date, ?::date); --bind(portal, alert_executions, 2020-06-23, 2020-06-28)
2020-06-22 17:53:03,574 default-akka.actor.default-dispatcher-4 [DEBUG] io.ebean.SUM : txn[1003] SqlQuery  rows[1] micros[17165] bind[portal, alert_executions, 2020-06-23, 2020-06-28]
[INFO] [06/22/2020 17:53:03.578] [default-akka.actor.default-dispatcher-2] [akka://default/deadLetters] Message [akka.actor.Status$Success] from Actor[akka://default/user/$a#-1087505711] to Actor[akka://default/deadLetters] was not delivered. [1] dead letters encountered. If this is not an expected behavior, then [Actor[akka://default/deadLetters]] may have terminated unexpectedly, This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2020-06-22 17:53:03,602 main [TRACE] io.ebean.TXN : txn[1006] Begin
2020-06-22 17:53:03,606 main [DEBUG] io.ebean.SQL : txn[1006] select t0.id, t0.uuid, t0.version, t0.created_at, t0.updated_at from portal.organizations t0 where t0.uuid = ? ; --bind(6ab72ba3-4289-4b8f-8638-d98cbb656995)
2020-06-22 17:53:03,606 main [DEBUG] io.ebean.SUM : txn[1006] FindMany type[Organization] origin[wECxR.A.A] exeMicros[3583] rows[1] predicates[t0.uuid = ? ] bind[6ab72ba3-4289-4b8f-8638-d98cbb656995]
2020-06-22 17:53:03,610 main [DEBUG] io.ebean.SQL : txn[1006] select t0.organization_id, t0.alert_id, t0.scheduled, t0.scheduled, t0.alert_id, t0.state, t0.started_at, t0.completed_at, t0.error, t0.result, t0.postgres> 2020-06-23 00:53:03.612 UTC organization_id from portal.alert_executions t0 join portal.organizations t1 on t1.id = t0.organization_id  where t1.uuid = ?  and t0.scheduled = ? ; --bind(6ab72ba3-4289-4b8f-8638-d98cbb656995,2020-06-23T00:53:03.601Z)
2020-06-22 17:53:03,610 main [DEBUG] io.ebean.SUM : txn[1006] FindMany type[AlertExecution] origin[rFWCu.A.A] exeMicros[3442] rows[0] predicates[t1.uuid = ?  and t0.scheduled = ? ] bind[6ab72ba3-4289-4b8f-8638-d98cbb656995,2020-06-23T00:53:03.601Z]
2020-06-22 17:53:03,610 main [DEBUG] io.ebean.SQL : txn[1006] insert into portal.alert_executions (organization_id, alert_id, scheduled, state, started_at, completed_at, error, result) values (?,?,?,?,?,?,?,?); --bind(4,7abed47c-1de6-4727-89ed-062536824e96,2020-06-23T00:53:03.601Z,STARTED,2020-06-23T00:53:03.610Z,null,null,null)
2020-06-22 17:53:03,614 main [DEBUG] io.ebean.SUM : txn[1006] Error[ERROR: no partition of relation "alert_executions" found for row   Detail: Partition key of the failing row contains (scheduled) = (2020-06-22 17:53:03.601).]
2020-06-22 17:53:03,616 main [TRACE] io.ebean.TXN : txn[1006] Commit - query only
2020-06-22 17:53:03,617 main [ERROR] com.arpnetworking.metrics.portal.alerts.impl.DatabaseAlertExecutionRepository : name="log", line="204", file="DatabaseExecutionHelper.java", class="class com.arpnetworking.metrics.portal.scheduling.impl.DatabaseExecutionHelper", message="Failed to job report executions", job.uuid="7abed47c-1de6-4727-89ed-062536824e96", scheduled="2020-06-23T00:53:03.601Z", state="STARTED" io.ebean.DataIntegrityException: Error[ERROR: no partition of relation "alert_executions" found for row   Detail: Partition key of the failing row contains (scheduled) = (2020-06-22 17:53:03.601).]
    at io.ebean.config.dbplatform.SqlCodeTranslator.translate(SqlCodeTranslator.java:48)
    at io.ebean.config.dbplatform.DatabasePlatform.translate(DatabasePlatform.java:211)
    at io.ebeaninternal.server.persist.dml.DmlBeanPersister.execute(DmlBeanPersister.java:83)
    at io.ebeaninternal.server.persist.dml.DmlBeanPersister.insert(DmlBeanPersister.java:49)
    at io.ebeaninternal.server.core.PersistRequestBean.executeInsert(PersistRequestBean.java:1217)
    at io.ebeaninternal.server.core.PersistRequestBean.executeNow(PersistRequestBean.java:764)
    at io.ebeaninternal.server.core.PersistRequestBean.ex[113] ERROR:  no partition of relation "alert_executions" found for row
postgres> DETAIL:  Partition key of the failing row contains (scheduled) = (2020-06-22 17:53:03.601)

Notice that the in-app log statement logs scheduled="2020-06-23T00:53:03.601Z" while the Postgres error message logs 2020-06-22 17:53:03.601, the local time.

To reiterate the current hypothesis given this context: because the UTC date is ahead of the local date, the partition creator creates a partition for 2020-06-23, but the insert fails to find a partition for 2020-06-22. Postgres looks for a 2020-06-22 partition because the JDBC connection initialized by Ebean binds the timestamp using the local JVM time zone (America/Los_Angeles). The time values used in the application are all Instants and therefore carry no zone information, which explains why the two sides interpret them differently.
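To make the discrepancy concrete, here is a tiny self-contained illustration (not project code): the same Instant lands on different calendar dates depending on the zone used to interpret it, which is exactly the partition-key mismatch in the log above.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public final class PartitionDateMismatch {
    public static void main(String[] args) {
        // The scheduled timestamp from the failing insert.
        Instant scheduled = Instant.parse("2020-06-23T00:53:03.601Z");

        // What the application (and the partition creator) computes: UTC.
        LocalDate utcDate = scheduled.atZone(ZoneOffset.UTC).toLocalDate(); // 2020-06-23

        // What Postgres effectively sees when the JDBC connection binds the
        // timestamp in the JVM's zone (America/Los_Angeles in the failing run).
        LocalDate localDate =
                scheduled.atZone(ZoneId.of("America/Los_Angeles")).toLocalDate(); // 2020-06-22

        System.out.println("partition created for: " + utcDate);
        System.out.println("row routed to:         " + localDate);
    }
}
```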

The partition creator is unaffected despite using the same connection pool because the partition dates are computed in application code, in UTC, rather than derived by the database from a bound timestamp. It's somewhat ironic that the JVM time zone setting manifests as the database being wrong and the application being right.
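A rough sketch of why the partition creator is immune (illustrative only, not the actual DailyPartitionCreator implementation): the boundaries are computed as LocalDates against a UTC clock in application code and bound as plain date parameters, so the connection's zone never gets a chance to shift them.

```java
import java.time.Clock;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Sketch: partition boundaries are fixed in application code using UTC before
// they are handed to the database as ?::date parameters.
public final class PartitionRangeSketch {
    public static void main(String[] args) {
        Clock utcClock = Clock.system(ZoneOffset.UTC);
        LocalDate start = LocalDate.now(utcClock); // e.g. 2020-06-23 in the log above
        LocalDate end = start.plusDays(5);         // e.g. 2020-06-28

        // Plain dates carry no time component, so there is nothing for the JDBC
        // driver to shift; they arrive at Postgres exactly as computed here.
        System.out.println("select * from create_daily_partition('portal', 'alert_executions', '"
                + start + "', '" + end + "');");
    }
}
```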