Netflix / genie

Distributed Big Data Orchestration Service
https://netflix.github.io/genie
Apache License 2.0
1.71k stars 367 forks

Fix exposing inconsistency in job status outside of persistence API #1153

Closed tgianos closed 2 years ago

tgianos commented 2 years ago

The job launch logic contained a helper method that queried the job status and, based on the returned value, either updated it or fell back to other logic. This works if all requests to the persistence service implementation go to a single consistent backend. If, however, read-only queries go to a read replica that may lag, or to some other implementation entirely, this breaks down without the service knowing why or how.

Moving this logic behind the persistence API and letting the launch service act only on the job status returned from the source-of-truth API should fix this problem.
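
A minimal sketch of the idea, not the actual change in this PR: instead of reading the status and then deciding what to write, the caller asks the persistence layer to perform the conditional transition itself and acts only on the status it returns. Every name here (`PersistenceService`, `updateStatusIfCurrent`, `InMemoryPersistenceService`, `shouldLaunch`, and the `JobStatus` values) is hypothetical and chosen for illustration; the in-memory map simply stands in for the source-of-truth backend.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout; this is not Genie's actual persistence API.
class JobStatusSketch {

    enum JobStatus { RESERVED, ACCEPTED, KILLED }

    /** Thrown when the source of truth has no record of the job. */
    static class JobNotFoundException extends RuntimeException {
        JobNotFoundException(String jobId) {
            super("No job with id " + jobId);
        }
    }

    interface PersistenceService {
        /**
         * Moves the job from {@code expected} to {@code desired} only if its
         * current status equals {@code expected}, and returns the status held
         * by the source of truth after the attempt.
         */
        JobStatus updateStatusIfCurrent(String jobId, JobStatus expected, JobStatus desired);
    }

    /** In-memory stand-in for the backend that owns the job status. */
    static class InMemoryPersistenceService implements PersistenceService {
        private final Map<String, JobStatus> statuses = new ConcurrentHashMap<>();

        void save(String jobId, JobStatus status) {
            statuses.put(jobId, status);
        }

        @Override
        public JobStatus updateStatusIfCurrent(String jobId, JobStatus expected, JobStatus desired) {
            // The check and the update happen in one atomic step, so the caller
            // never decides based on a separate (possibly stale) read.
            JobStatus result = statuses.computeIfPresent(
                jobId, (id, current) -> current == expected ? desired : current);
            if (result == null) {
                throw new JobNotFoundException(jobId);
            }
            return result;
        }
    }

    /**
     * Launch-side logic: no separate status query, only a reaction to the
     * status returned by the persistence API. A real implementation would also
     * have to handle a job that was already ACCEPTED by another caller; that
     * case is left out of this sketch.
     */
    static boolean shouldLaunch(PersistenceService persistence, String jobId) {
        JobStatus status = persistence.updateStatusIfCurrent(
            jobId, JobStatus.RESERVED, JobStatus.ACCEPTED);
        return status == JobStatus.ACCEPTED;
    }

    public static void main(String[] args) {
        InMemoryPersistenceService persistence = new InMemoryPersistenceService();
        persistence.save("job-1", JobStatus.RESERVED);
        persistence.save("job-2", JobStatus.KILLED);
        System.out.println(shouldLaunch(persistence, "job-1"));
        System.out.println(shouldLaunch(persistence, "job-2"));
    }
}
```

With this shape the launch service has no read path of its own that could be routed to a lagging replica; whether the backend uses a conditional update, a transaction, or something else stays an implementation detail of the persistence layer.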

coveralls commented 2 years ago

Coverage Status

Coverage decreased (-0.01%) to 93.788% when pulling 556028cbd02bc7cf136a903f382ff54aacd46544 on tgianos:jobStatusFix into cfe9cfd0dfee750d0778936782472439d4e70719 on Netflix:master.