GoogleCloudPlatform / oozie-to-airflow

Oozie Workflow to Airflow DAGs migration tool
Apache License 2.0

chore(deps): bump apache-airflow from 1.10.14 to 2.6.0 #694

Closed dependabot[bot] closed 1 year ago

dependabot[bot] commented 1 year ago

Bumps apache-airflow from 1.10.14 to 2.6.0.

Release notes

Sourced from apache-airflow's releases.

Apache Airflow 2.6.0

Significant Changes

Default permissions of file task handler log directories and files have been changed to "owner + group" writeable (#29506).

The default setting handles the case where impersonation is needed and both users (airflow and the impersonated user) have the same group set as their main group. Previously the default was also other-writeable; users can restore the other-writeable setting if they wish by configuring file_task_handler_new_folder_permissions and file_task_handler_new_file_permissions in the logging section.
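As a rough illustration of what "owner + group" writeable means in terms of mode bits, the sketch below decodes octal modes with Python's standard `stat` module. The modes `0o775` and `0o777` are assumptions chosen for illustration, not values quoted from the release notes, and `writable_by` is a hypothetical helper.

```python
import stat


def writable_by(mode: int) -> set[str]:
    """Return the permission classes whose write bit is set in `mode`."""
    classes = {
        "owner": stat.S_IWUSR,
        "group": stat.S_IWGRP,
        "other": stat.S_IWOTH,
    }
    return {name for name, bit in classes.items() if mode & bit}


# Illustrative modes only: one without and one with the "other" write bit.
print(sorted(writable_by(0o775)))
print(sorted(writable_by(0o777)))
```

A mode that keeps the owner and group write bits but clears the "other" write bit matches the new default described above; setting the two logging options to a mode that includes the "other" write bit restores the previous behaviour.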

SLA callbacks no longer add files to the dag processor manager's queue (#30076)

This stops SLA callbacks from keeping the dag processor manager permanently busy. It reduces CPU usage and fixes issues where SLAs stop the system from seeing changes to existing dag files. Additional metrics were added to help track queue state.

The cleanup() method in BaseTrigger is now defined as asynchronous (following the async/await pattern) (#30152).

This is potentially a breaking change for any custom trigger implementations that override the cleanup() method and use synchronous code. However, using synchronous operations in cleanup was technically wrong, because the method was executed in the main loop of the Triggerer and introduced unnecessary delays that impacted other triggers. The change is unlikely to affect any existing trigger implementations.
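A minimal sketch of what an asynchronous cleanup override looks like, using only `asyncio`. `StandInTrigger` is a hypothetical stand-in written for this example, not Airflow's `BaseTrigger`, and the `asyncio.sleep(0)` call is a placeholder for whatever non-blocking teardown a real trigger would await.

```python
import asyncio


class StandInTrigger:
    """Hypothetical stand-in for a custom trigger; not Airflow's BaseTrigger."""

    def __init__(self) -> None:
        self.connection_open = True

    async def cleanup(self) -> None:
        # Awaiting here hands control back to the event loop, so other
        # coroutines sharing the loop are not blocked during teardown.
        await asyncio.sleep(0)  # placeholder for e.g. an async client close
        self.connection_open = False


async def run_and_clean_up() -> bool:
    trigger = StandInTrigger()
    await trigger.cleanup()
    return trigger.connection_open


print(asyncio.run(run_and_clean_up()))
```

In a real custom trigger the same shape would presumably apply: declare `cleanup` with `async def` and await any I/O rather than calling blocking functions.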

The gauge scheduler.tasks.running no longer exists (#30374)

The gauge never worked and its value was always 0. Producing an accurate value for this metric is complex, so it was decided that removing the gauge makes more sense than fixing it with no certainty that its value would be correct.

Consolidate handling of tasks stuck in queued under new task_queued_timeout config (#30375)

Logic for handling tasks stuck in the queued state has been consolidated, and all configurations responsible for timing out stuck queued tasks have been deprecated and merged into [scheduler] task_queued_timeout. The configurations that have been deprecated are [kubernetes] worker_pods_pending_timeout, [celery] stalled_task_timeout, and [celery] task_adoption_timeout. If any of these configurations are set, the longest timeout will be respected. For example, if [celery] stalled_task_timeout is 1200, and [scheduler] task_queued_timeout is 600, Airflow will set [scheduler] task_queued_timeout to 1200.
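The "longest timeout wins" rule can be sketched as a small function. This is an illustration of the rule as described above, not Airflow's implementation; `effective_queued_timeout` is a hypothetical name, and `None` is assumed here to mean "option not set".

```python
from typing import Optional


def effective_queued_timeout(
    task_queued_timeout: float,
    *deprecated_timeouts: Optional[float],
) -> float:
    """Return the longest timeout among the new option and any deprecated ones that are set."""
    candidates = [task_queued_timeout]
    candidates.extend(t for t in deprecated_timeouts if t is not None)
    return max(candidates)


# The example from the note: stalled_task_timeout=1200, task_queued_timeout=600.
print(effective_queued_timeout(600, 1200, None, None))
```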

Improvement Changes

Display only the running configuration in configurations view (#28892)

The configurations view now only displays the running configuration. Previously, the default configuration was displayed at the top but it was not obvious whether this default configuration was overridden or not. Subsequently, the non-documented endpoint /configuration?raw=true is deprecated and will be removed in Airflow 3.0. The HTTP response now returns an additional Deprecation header. The /config endpoint on the REST API is the standard way to fetch Airflow configuration programmatically.
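For programmatic access, a request to the REST API's config endpoint might be built as sketched below. The base URL is a placeholder, the `/api/v1` prefix is assumed to be the stable REST API prefix of the deployment, authentication is omitted, and `build_config_request` is a hypothetical helper. The request is only constructed here, not sent.

```python
import urllib.request


def build_config_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the REST API config endpoint."""
    # Assumption: the stable REST API is served under /api/v1.
    url = base_url.rstrip("/") + "/api/v1/config"
    return urllib.request.Request(
        url,
        headers={"Accept": "application/json"},
        method="GET",
    )


# Placeholder host and port; replace with the actual webserver address.
request = build_config_request("http://localhost:8080")
print(request.full_url)
```

Sending the request would additionally need credentials accepted by the deployment's configured auth backend.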

Explicit skipped states list for ExternalTaskSensor (#29933)

ExternalTaskSensor now has an explicit skipped_states list

Miscellaneous Changes

Handle OverflowError on exponential backoff in next_run_calculation (#28172)

The maximum task retry delay is set to 24h (86400s) by default. You can change it globally via the core.max_task_retry_delay parameter.
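The idea of capping exponential backoff and tolerating an overflow can be sketched as follows. This is a simplified illustration, not Airflow's next-retry calculation (it leaves out any jitter, for instance); `capped_backoff` and `MAX_RETRY_DELAY` are hypothetical names.

```python
from datetime import timedelta

# 24h default mentioned in the note above.
MAX_RETRY_DELAY = timedelta(seconds=86400)


def capped_backoff(
    base_delay: timedelta,
    try_number: int,
    max_delay: timedelta = MAX_RETRY_DELAY,
) -> timedelta:
    """Double the delay on each try, never exceeding `max_delay`."""
    try:
        delay = base_delay * (2 ** (try_number - 1))
    except OverflowError:
        # The product no longer fits in a timedelta; fall back to the cap.
        return max_delay
    return min(delay, max_delay)


print(capped_backoff(timedelta(seconds=300), 3))
print(capped_backoff(timedelta(seconds=300), 1000))
```

Without the `except` branch, a very large try number would raise `OverflowError` before the `min` comparison could apply the cap.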

... (truncated)

Changelog

Sourced from apache-airflow's changelog.

Airflow 2.6.0 (2023-04-30)



Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/GoogleCloudPlatform/oozie-to-airflow/network/alerts).

dependabot[bot] commented 1 year ago

Superseded by #697.