LineaLabs / lineapy

Move fast from data science prototype to pipeline. Capture, analyze, and transform messy notebooks into data pipelines with just two lines of code.
https://lineapy.org
Apache License 2.0
662 stars 58 forks source link

Bump apache-airflow from 2.2.4 to 2.4.0 #805

Closed dependabot[bot] closed 1 year ago

dependabot[bot] commented 1 year ago

Bumps apache-airflow from 2.2.4 to 2.4.0.

Release notes

Sourced from apache-airflow's releases.

Apache Airflow 2.4.0

New Features

  • Add Data-aware Scheduling (https://github.com/apache/airflow/pulls?q=is%3Apr+is%3Amerged+label%3AAIP-48+milestone%3A%22Airflow+2.4.0%22)
  • Add @task.short_circuit TaskFlow decorator (#25752)
  • Make execution_date_or_run_id optional in tasks test command (#26114)
  • Automatically register DAGs that are used in a context manager (#23592, #26398)
  • Add option of sending DAG parser logs to stdout. (#25754)
  • Support multiple DagProcessors parsing files from different locations. (#25935)
  • Implement ExternalPythonOperator (#25780)
  • Make execution_date optional for command dags test (#26111)
  • Implement expand_kwargs() against a literal list (#25925)
  • Add trigger rule tooltip (#26043)
  • Add conf parameter to CLI for airflow dags test (#25900)
  • Include scheduled slots in pools view (#26006)
  • Add output property to MappedOperator (#25604)
  • Add roles delete command to cli (#25854)
  • Add Airflow specific warning classes (#25799)
  • Add support for TaskGroup in ExternalTaskSensor (#24902)
  • Add @task.kubernetes taskflow decorator (#25663)
  • Add a way to import Airflow without side-effects (#25832)
  • Let timetables control generated run_ids. (#25795)
  • Allow per-timetable ordering override in grid view (#25633)
  • Grid logs for mapped instances (#25610, #25621, #25611)
  • Consolidate to one schedule param (#25410)
  • DAG regex flag in backfill command (#23870)
  • Adding support for owner links in the Dags view UI (#25280)
  • Ability to clear a specific DAG Run's task instances via REST API (#23516)
  • Possibility to document DAG with a separate markdown file (#25509)
  • Add parsing context to DAG Parsing (#25161)
  • Implement CronTriggerTimetable (#23662)
  • Add option to mask sensitive data in UI configuration page (#25346)
  • Create new databases from the ORM (#24156)
  • Implement XComArg.zip(*xcom_args) (#25176)
  • Introduce sla_miss metric (#23402)
  • Implement map() semantic (#25085)
  • Add override method to TaskGroupDecorator (#25160)
  • Implement expand_kwargs() (#24989)
  • Add parameter to turn off SQL query logging (#24570)
  • Add DagWarning model, and a check for missing pools (#23317)
  • Add Task Logs to Grid details panel (#24249)
  • Added small health check server and endpoint in scheduler(#23905)
  • Add built-in External Link for ExternalTaskMarker operator (#23964)
  • Add default task retry delay config (#23861)
  • Add clear DagRun endpoint. (#23451)
  • Add support for timezone as string in cron interval timetable (#23279)
  • Add auto-refresh to dags home page (#22900, #24770)

Improvements

... (truncated)

Changelog

Sourced from apache-airflow's changelog.

Airflow 2.4.0 (2022-09-19)

Significant Changes ^^^^^^^^^^^^^^^^^^^

Data-aware Scheduling and Dataset concept added to Airflow """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

New to this release of Airflow is the concept of Datasets to Airflow, and with it a new way of scheduling dags: data-aware scheduling.

This allows DAG runs to be automatically created as a result of a task "producing" a dataset. In some ways this can be thought of as the inverse of TriggerDagRunOperator, where instead of the producing DAG controlling which DAGs get created, the consuming DAGs can "listen" for changes.

A dataset is identified by a URI:

.. code-block:: python

from airflow import Dataset

The URI doesn't have to be absolute

dataset = Dataset(uri='my-dataset')

Or you can use a scheme to show where it lives.

dataset2 = Dataset(uri='s3://bucket/prefix')

To create a DAG that runs whenever a Dataset is updated use the new schedule parameter (see below) and pass a list of 1 or more Datasets:

.. code-block:: python

with DAG(dag_id='dataset-consmer', schedule=[dataset]):
    ...

And to mark a task as producing a dataset pass the dataset(s) to the outlets attribute:

.. code-block:: python

@task(outlets=[dataset])
def my_task():
    ...

Or for classic operators

BashOperator(task_id="update-ds", bash_command=..., outlets=[dataset])

If you have the producer and consumer in different files you do not need to use the same Dataset object, two Dataset()\s created with the same URI are equal.

... (truncated)

Commits


Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) - `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language - `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language - `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language - `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/LineaLabs/lineapy/network/alerts).
lionsardesai commented 1 year ago

not a concern. we do not have an airflow dependency. closing due to noise.

dependabot[bot] commented 1 year ago

OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting @dependabot ignore this major version or @dependabot ignore this minor version.

If you change your mind, just re-open this PR and I'll resolve any conflicts on it.