Deprecate 'airflow dags test' and merge the use case into 'airflow dags backfill' and 'airflow dags trigger'

Description

Currently (in versions up to 2.1.4), airflow dags test <dag_id> <execution_date> creates a backfill run at the specified datetime. This, however, applies regardless of whether the DAG can actually have a logically automated backfill at that specific datetime or not. One example of this logically confusing behaviour is shown in #18473. A DAG with schedule_interval=None should logically have no backfill runs ever, but the test command would still happily create a backfill run at that datetime.

With the introduction of custom timetables in AIP-39, the DAG scheduling logic went through some extensive refactoring to conform more closely to the DAG's schedule/timetable specification. This means that a backfill run can no longer be created at will. The 2.2 release will contain a hack to keep the current behaviour of "free" backfill run creation via test (#18742), but I would prefer this to be a temporary measure and be removed once we have a better solution.

The root cause of this issue is, IMO, that airflow dags test has very poor semantics as currently designed. It is entirely non-obvious that it creates backfill runs (so a subsequent airflow dags backfill call would skip the specific datetime if and only if it lies on the logical schedule), or why a backfill can happen without considering the schedule (it is the only way to do that in Airflow AFAIK). And the name test itself is somewhat of a misnomer: why is creating a backfill run a test in the first place?
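To make the confusing interaction concrete, here is a toy model in plain Python. It is only a sketch and does not use any Airflow code: ToyDag and its methods are hypothetical names that mimic the behaviour described above, where test records a backfill run at any datetime and backfill only walks the dates the schedule would produce.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ToyDag:
    """Hypothetical stand-in for a DAG; not an Airflow class."""

    dag_id: str
    start: datetime
    interval: Optional[timedelta]  # None models schedule_interval=None
    runs: dict = field(default_factory=dict)  # logical date -> run type

    def on_schedule(self, when: datetime) -> bool:
        """True if `when` is a date the schedule itself would produce."""
        if self.interval is None or when < self.start:
            return False
        return (when - self.start) % self.interval == timedelta(0)

    def test(self, when: datetime) -> None:
        """Models `dags test`: records a backfill run without consulting the schedule."""
        self.runs[when] = "backfill"

    def backfill(self, start: datetime, end: datetime) -> list:
        """Models `dags backfill`: creates runs only for scheduled dates that have none yet."""
        created = []
        if self.interval is None:
            return created  # nothing is ever on the schedule
        when = self.start
        while when <= end:
            if when >= start and when not in self.runs:
                self.runs[when] = "backfill"
                created.append(when)
            when += self.interval
        return created


daily = ToyDag("daily", datetime(2021, 10, 1), timedelta(days=1))
daily.test(datetime(2021, 10, 2))      # on the schedule
daily.test(datetime(2021, 10, 2, 12))  # off the schedule, still recorded
created = daily.backfill(datetime(2021, 10, 1), datetime(2021, 10, 3))

unscheduled = ToyDag("manual_only", datetime(2021, 10, 1), None)
unscheduled.test(datetime(2021, 10, 2))  # a backfill run on a DAG that can never be backfilled
```

In this model, the later backfill silently skips Oct 2 because test already created a run there, while the run at noon on Oct 2 and the run on the unscheduled DAG sit at dates no backfill would ever visit.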

Use case/motivation

From what I can tell, the primary use case of airflow dags test at the moment is to check whether a DAG implements its tasks reasonably before it's activated. For this particular use case, the user does not actually care what kind of run is used, so a manual run would do. But we should also create a migration path for those relying on airflow dags test to create a backfill run, since the implied side effect of saving a backfill run later on is also somewhat useful.

So the plan I currently have in mind is:

1. Add a new flag to airflow dags trigger to allow triggering a manual run and executing it directly in the console (instead of sending it to the scheduler). This will need some new mechanism, since trigger is currently implemented by DAG.create_dagrun(). I think we'll need a new job class, e.g. ManualRunJob.
2. Add a new flag to airflow dags backfill to do the same thing, but with a backfill run. This would cover the exact same use case as airflow dags test right now, but with more obvious semantics. The syntax would, however, be significantly more verbose, so we need to work on that as well.
3. Deprecate airflow dags test, since its usage can be covered by the above two additions.

Related issues

Issue raised against 2.2 beta about the changed behaviour: #18473
PR to "restore" the pre-2.2 behaviour: #18742

Are you willing to submit a PR?

[X] Yes I am willing to submit a PR!

Code of Conduct

[X] I agree to follow this project's Code of Conduct

One comment on that (via #19578) is that users also use airflow tasks test (and quite often) to run single tasks for a particular DAG. I think while we could get by with backfill for dags test, this will not be the case for tasks test, and we need to figure out how to pass the data_interval to that command as well. I think from the docs the intention was that infer_data_interval for the custom timetable will do that, but it does not seem to be plugged in currently (and tasks test with custom timetables fails as it cannot infer the interval).

I think if we solve it for tasks test, the same solution could be used for dags test.
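As a rough illustration of what "inferring the interval" from a single datetime means, here is a toy sketch in plain Python. It does not use Airflow's timetable classes; infer_daily_data_interval is a made-up helper, and the daily, midnight-aligned schedule is an assumption chosen only to keep the example small.

```python
from datetime import datetime, timedelta
from typing import Tuple


def infer_daily_data_interval(run_after: datetime) -> Tuple[datetime, datetime]:
    """Hypothetical helper: map one datetime to the most recent complete daily interval."""
    # The interval ends at the latest midnight at or before `run_after`...
    end = run_after.replace(hour=0, minute=0, second=0, microsecond=0)
    # ...and starts exactly one day earlier.
    start = end - timedelta(days=1)
    return start, end


start, end = infer_daily_data_interval(datetime(2021, 10, 5, 13, 30))
```

A command that receives only one date on the command line would need some rule like this from the timetable before it can hand a data interval to the task.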

One good thing about dags test also is that it uses "DebugExecutor" and does not leave traces in the DB after it is executed (or so I thought at least, and for sure that was the intention of the dags test command). I am not sure if this assumption still holds true or needs to be "fixed". But the idea was that you run the whole DAG this way, but that it will not be stored in the database. I am not sure if this is something that we want to mix with backfill, though. For me dags test still has a good use here.
