Previously, AirflowDbReachableCondition restarted the container whenever there was a problem reaching the database. While that is the desired behaviour eventually, I am changing the condition to be informational only for now, i.e. it only reports the connection problem, for two reasons:
1. This is the current behaviour in our internal images, so we want to reduce how much we deviate from it.
2. More importantly, the restart logic needs to be more intelligent to avoid unnecessary restarts, each of which incurs a considerable wait while the container is being replaced.
The re-introduction of restarts will be tracked in Issue #75.
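To illustrate the difference between the two behaviours, here is a minimal sketch of a database-reachability check that either only reports a failure or additionally requests a restart. It is not the code in this PR: `Action`, `CheckResult`, `check_db_reachable`, and the `probe` callable are hypothetical names used only for illustration.

```python
import enum
from dataclasses import dataclass
from typing import Callable


class Action(enum.Enum):
    """What to do when the check fails (hypothetical, for illustration)."""

    REPORT_ONLY = "report_only"
    RESTART = "restart"


@dataclass
class CheckResult:
    healthy: bool
    message: str
    restart_requested: bool


def check_db_reachable(
    probe: Callable[[], None],
    action: Action = Action.REPORT_ONLY,
) -> CheckResult:
    """Run `probe` (assumed to raise if the database is unreachable).

    In REPORT_ONLY mode a failure is surfaced in the result but never
    requests a restart; in RESTART mode the failure also sets
    `restart_requested`.
    """
    try:
        probe()
    except Exception as exc:
        return CheckResult(
            healthy=False,
            message=f"Database unreachable: {exc}",
            restart_requested=action is Action.RESTART,
        )
    return CheckResult(healthy=True, message="Database reachable", restart_requested=False)
```

With the default `REPORT_ONLY` action, a failing probe yields an unhealthy result whose `restart_requested` flag stays `False`, which corresponds to the informational-only behaviour described above.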
This PR also re-introduces the generation of processable logs to capture the DB health metrics in the MWAA service.
Finally, the PR ports the log-sniffing logic that detects common Airflow problems.
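For readers unfamiliar with the idea, log sniffing can be sketched roughly as below: scan log lines against a set of known patterns and report which problem each matching line indicates. The pattern names, the regular expressions, and the `sniff` function are assumptions made for this example and are not taken from the ported logic.

```python
import re
from typing import Iterable, List, Tuple

# Illustrative patterns only; the real set of detected problems is not
# described in this PR text.
PATTERNS = [
    ("missing_dependency", re.compile(r"ModuleNotFoundError: No module named '([^']+)'")),
    ("db_connection", re.compile(r"OperationalError")),
]


def sniff(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (problem_name, stripped_line) for each line matching a pattern.

    Each line is attributed to the first pattern that matches it.
    """
    findings = []
    for line in lines:
        for name, pattern in PATTERNS:
            if pattern.search(line):
                findings.append((name, line.strip()))
                break
    return findings
```

A caller would feed it the container's log lines and turn each finding into a warning or a metric.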
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Issue #, if available: N/A