aws / amazon-mwaa-docker-images

Apache License 2.0

Make DB condition informational + Log sniffing #76

Closed by rafidka 3 months ago

rafidka commented 3 months ago

Issue #, if available: N/A

Description of changes:

Previously, AirflowDbReachableCondition triggered a container restart whenever there was a problem reaching the database. While this is the behaviour we eventually want, I am changing the condition to be informational only for now, i.e. it just reports the connection problem, for two reasons:

  1. This is the current behaviour in our internal images, so we want to reduce how much we deviate from it.
  2. More importantly, the restart logic would need to be a bit more intelligent to avoid unnecessary restarts, each of which results in a considerable wait while the container is being replaced.

The re-introduction of restarts will be tracked in Issue #75.
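To illustrate the difference between a restart-triggering condition and an informational one, here is a minimal sketch. It is not the code in this PR: `DbReachableCondition`, `ConditionResult`, the `informational` flag, and `failing_probe` are hypothetical names chosen for the example, and the real AirflowDbReachableCondition may be structured quite differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConditionResult:
    healthy: bool
    message: str


class DbReachableCondition:
    """Hypothetical stand-in, not the repository's AirflowDbReachableCondition."""

    def __init__(self, probe: Callable[[], None], informational: bool = True):
        self.probe = probe                  # assumed to raise if the DB is unreachable
        self.informational = informational  # True: report only, never restart

    def check(self) -> ConditionResult:
        try:
            self.probe()
            return ConditionResult(True, "database reachable")
        except Exception as exc:
            return ConditionResult(False, f"database unreachable: {exc}")

    def should_restart(self, result: ConditionResult) -> bool:
        # An informational condition reports the failure but never asks for a restart.
        return not result.healthy and not self.informational


def failing_probe() -> None:
    # Simulates a database that refuses connections.
    raise ConnectionError("connection refused")


condition = DbReachableCondition(failing_probe, informational=True)
result = condition.check()
print(result.message, "| restart requested:", condition.should_restart(result))
```

Under this assumption, re-enabling restarts later (Issue #75) would amount to flipping the flag once the restart decision is smart enough to avoid needless container replacements.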

This PR also re-introduces the generation of processable logs, so that the MWAA service can capture the DB health metrics.

Finally, the PR ports the log sniffing logic that detects common Airflow problems.
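For readers unfamiliar with the idea, log sniffing can be sketched roughly as follows: scan each log line against a set of known error patterns and report the matches. The pattern names and regular expressions below are hypothetical examples, not the ones ported in this PR, and `sniff` is an illustrative helper rather than a function from the repository.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical patterns; the PR does not list the ones it actually ports.
KNOWN_PROBLEMS = [
    ("missing_module", re.compile(r"ModuleNotFoundError: No module named '([^']+)'")),
    ("db_connection", re.compile(r"OperationalError.*could not connect to server")),
]


def sniff(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (problem_name, offending_line) for every line matching a known pattern."""
    findings = []
    for line in lines:
        for name, pattern in KNOWN_PROBLEMS:
            if pattern.search(line):
                findings.append((name, line.rstrip()))
                break  # report at most one problem per line
    return findings


sample = [
    "INFO starting scheduler\n",
    "ModuleNotFoundError: No module named 'pandas'\n",
]
for name, line in sniff(sample):
    print(f"[{name}] {line}")
```

Keeping the patterns in a plain list makes it easy to add a new known problem without touching the scanning loop.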


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.