apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0
35.67k stars 13.89k forks source link

Additional cron expression validation #16731

Open uranusjr opened 3 years ago

uranusjr commented 3 years ago

Description Airflow should outline (in documentation) what cron expressions are supported, and perform appropriate additional validation on user-supplied cron expressions.

Use case / motivation

16692 proposed displaying a short description about the cron expression in the web UI. This bring out the issue that currently the cron expression formats supported by Airflow are essentially defined by croniter implementation, which is non-standard (POSIX only allows five-segment formats) and sometimes buggy (see https://github.com/apache/airflow/issues/16107#issuecomment-869559200). Since croniter itself does not offer expression description, we need to do this with another tool, meaning that non-standard expressions can be incorrectly described and cause user confusion.

It would be beneficial for Airflow to explicitly specify what cron expression formats are supported (only POSIX, or POSIX plus extensions like the optional year segment, or something else), and perform additional validation to ensure croniter does not receive formats not supported by Airflow, even if it can handle them. Personally I’m inclined to only do POSIX and reject the sixth segment entirely, at least until croniter fixes its handling to correctly interpret it (see taichino/croniter#76).

Are you willing to submit a PR? Yeah

Related Issues https://github.com/apache/airflow/issues/16107#issuecomment-869559200

16692

taichino/croniter#76

eladkal commented 3 years ago

By addition validation do you mean to remove the 6 segmant and then trust croniter ?

uranusjr commented 3 years ago

I think we should build a simple validator to check for the whole string before passing it into croniter (and hopefully croniter will always work if the string passes our test).

The standard POSIX syntax is fairly easy to parse, plus support for SUN-SAT. Is JAN-DEC a GNU extension? That can be added as well if it works in croniter.

ashb commented 3 years ago

We don't want to restrict to just POSIX cron -- some of the non-POSIX extensions are widely supported and useful 0 0 * * 5#3,L5 for instace.