Azure / azure-databricks-operator

Kubernetes Operator for Databricks
MIT License

Sets Run to terminal state if it has been deleted from Databricks first #158

Closed magencio closed 4 years ago

magencio commented 4 years ago

Fix for issue #156: Reconciler error when refreshing a Run that has been deleted in Databricks.
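To illustrate the idea behind the fix, here is a minimal sketch in Python. It is not the operator's actual code: `RunNotFoundError`, `RunStatus`, `refresh_run`, and the choice of `INTERNAL_ERROR` as the terminal state are all assumptions made for illustration. The point is only that a "run no longer exists" response during a refresh is turned into a terminal status instead of an error that keeps the reconciler retrying.

```python
from dataclasses import dataclass
from typing import Callable

# Terminal life cycle states as named in the Databricks Jobs API.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


class RunNotFoundError(Exception):
    """Hypothetical error: Databricks reports that the run does not exist."""


@dataclass
class RunStatus:
    life_cycle_state: str
    state_message: str = ""


def is_terminal(status: RunStatus) -> bool:
    """Return True if no further refreshes are needed for this run."""
    return status.life_cycle_state in TERMINAL_STATES


def refresh_run(
    run_id: int,
    fetch_run: Callable[[int], RunStatus],
    current: RunStatus,
) -> RunStatus:
    """Refresh a run's status; treat a deleted run as finished.

    fetch_run is a stand-in for the API call that looks up the run.
    """
    try:
        return fetch_run(run_id)
    except RunNotFoundError:
        if is_terminal(current):
            # Already finished: keep the last known status.
            return current
        # Assumption: mark the run as terminal so reconciliation stops
        # instead of failing on every refresh.
        return RunStatus(
            life_cycle_state="INTERNAL_ERROR",
            state_message=f"Run {run_id} no longer exists in Databricks",
        )
```

With this shape, the caller can check `is_terminal` on the returned status and stop requeueing the run.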

stuartleeks commented 4 years ago

Update - I've now managed to run a load test with this change.

tl;dr: this looks OK. I'm going to remove the not-for-review label.

Load pattern:

image

API calls (shows when the mock API failure behaviour was configured: 20% status code 500 responses in the first batch, 100% in the second batch):

image
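The failure injection described above (a configurable share of status code 500 responses) can be sketched roughly as follows. This is an illustrative Python stand-in, not the mock API used in the load test: the `FlakyMockApi` class, its `get_run` method, and the response payloads are hypothetical.

```python
import random


class FlakyMockApi:
    """Hypothetical mock API that fails a configurable share of calls."""

    def __init__(self, error_rate: float, seed: int = 0) -> None:
        if not 0.0 <= error_rate <= 1.0:
            raise ValueError("error_rate must be between 0.0 and 1.0")
        self.error_rate = error_rate
        # Seeded generator so a test run is repeatable.
        self._rng = random.Random(seed)
        self.calls = 0
        self.failures = 0

    def get_run(self, run_id: int):
        """Return (status_code, body); 500 with probability error_rate."""
        self.calls += 1
        # random() is in [0.0, 1.0), so a rate of 1.0 always fails
        # and a rate of 0.0 never fails.
        if self._rng.random() < self.error_rate:
            self.failures += 1
            return 500, {"error": "injected failure"}
        return 200, {"run_id": run_id, "life_cycle_state": "RUNNING"}
```

A first batch could use `FlakyMockApi(error_rate=0.2)` and a second batch `FlakyMockApi(error_rate=1.0)`, matching the two configurations mentioned above.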

Mean value for poll_run_await_completion (which is a proxy for latency in detecting run completion) stays within a reasonable range:

image

Controller reconciliation rate dips with status 500 responses, but recovers well:

image

The work queue etc. looks busy, but doesn't keep growing (and poll_run_await_completion was OK above):

image

stuartleeks commented 4 years ago

/azp run

azure-pipelines[bot] commented 4 years ago

Azure Pipelines successfully started running 1 pipeline(s).

Azadehkhojandi commented 4 years ago

/azp run

azure-pipelines[bot] commented 4 years ago

Azure Pipelines successfully started running 1 pipeline(s).