Open potiuk opened 1 month ago
@potiuk I am interested in this, should I take it on? It might take me a bit of time to work out, as this area is new to me, but it will help me understand more about the Airflow CI/CD. :)
It's going to be a little more complex, it seems - we will likely have to redesign a bit the way tests are executed in CI with different test "groups", so I will start a wider discussion on the Slack channel to discuss how to do it and involve others.
Absolutely! I just began analyzing this yesterday to see which components can be grouped or not. I noticed some existing options, like `skip-provider-tests`, and now that we have the Task SDK, we may need to support executing or skipping its tests as well. I completely agree that some redesign is necessary.
Hey here. I would like to start a discussion and propose some redesign of our CI, to address some of the issues mentioned here, and involve others so that we come up with a better design together and then get it implemented together.
First let me state the problems and what is wrong currently. I will write a separate comment with the first proposal to address them.
Also "system" tests are a separate type - but they are under "tests" in "airflow" and "provider' tests.
Those tests cannot be run together in a single pytest command, because all of them have a conftest.py file at the top folder and pytest refuses to run them together, as it does not know which conftest.py is the "top-level" one.
Currently we have "non-db-tests" that runs all the "non-db" tests together in a single xdist/parallel fashion and it currenlty runs together "tests" and "providers/tests" together - but this is really thanks to a workaround for it in "providers" (there is an __init__.py
in "providers" folder that makes the test belong to package "providers.tests" - but this is really a hack because "providers" is where the whole "providers" project is, it's not a python "package".
Both "airflow" and "providers" tests are currently run with breeze testing tests
command (which by default used to run ALL tests ("airflow + providers") and only when you specified particular packages or modules to run as additional arguments woudl run only those tests you specified. This was a bit "hacky" approach how it was done and one of the reasons why running system tests were broken after provider's separation (i temporarily fixed it in #43529 but this is really a band-aid).
Also, the same `breeze testing tests` command is used to run "helm" unit tests - and there is another "hack" implemented: we are also using the breeze container to run those tests, but "helm" tests do not need to be run in the breeze container at all - they only need the small local venv we already have from k8s where `helm` is installed - and we can run them there.
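A minimal sketch of what that could look like; the venv path is an assumption, `helm_tests` is the existing test folder mentioned later in this thread:

```bash
# Hypothetical: activate the small k8s venv (path assumed) where helm is already
# installed, instead of spinning up the full breeze CI container:
source ./.build/.k8s-env/bin/activate
pytest helm_tests -n auto   # -n auto assumes pytest-xdist is installed in that venv
```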
Actually, `airflow` and `task_sdk` tests technically (eventually, when we fully remove all provider references from airflow core and tests) will not have to use the breeze container either. With UV we could technically run the tests in a venv created dynamically in CI (with main constraints / uv lock for reproducibility) rather than using the CI image - because we do not need all the 700+ deps; they should be "reproducible enough" without our CI image.
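A minimal sketch of such a dynamically created venv, assuming a local constraints.txt (e.g. fetched from the constraints-main branch) and the `devel` extra; none of this is the final design:

```bash
# Create a throw-away venv with uv instead of building/pulling the CI image.
uv venv .venv-tests
source .venv-tests/bin/activate
# Pin installs via constraints (or an uv lock file) so the CI run stays reproducible.
uv pip install -e ".[devel]" --constraint constraints.txt
# Non-DB tests can then run fully in parallel (pytest-xdist assumed).
pytest tests -n auto
```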
We already have separate "TaskSDK tests" that run in the same environment as "Non-DB tests" - but only for "task_sdk" - and they have a separate `breeze testing task-sdk-tests` command. It looks like a good direction.
My proposal (high level) - it needs some detailed thinking and prototyping regarding the parallel execution of tests in CI - is to follow the `breeze testing task-sdk-tests` approach and introduce three more testing commands:

- `breeze testing provider-tests` command
- `breeze testing helm-tests` command
- `breeze testing system-tests` command

This means that we will lose some of the parallelism benefits when we run `provider` tests and `airflow` tests together (but instead we will increase the number of jobs we have in CI, which we should still be able to do without exceeding the limits that INFRA has on the number of parallel jobs we can run in CI).
We can also gradually move tests to local venvs where they do not need database containers from docker compose, nor the Python / system dependencies that we currently standardise in the CI image.
Also, System tests should have a "dry-run" mode, where we can run them as part of the CI so that we know they are not broken. We are not able to run system tests for all external systems, but at least we should have some "dry-run" system tests that we can run with all PRs.
So, at a high level, we have:
System tests are not currently executed.
And we can do it this way:
Likely we could attempt to make TaskSDK and Helm tests run in a venv rather than in the CI image even now - I think for the Task SDK (@kaxil @ashb ?) there are very few dependencies and we could easily use a local venv instead of the whole CI image, as there are no specific system or provider dependencies there.
Also, splitting out tests into "Airflow" and "Providers" without moving them out of the CI image will help us address the issues we have now with conftest.py and the actual "separation" of the different test groups.
When we remove all the remaining references to providers from Airflow, we might move the Airflow Non-DB tests to a local virtualenv.
We switch the provider tests to use the Task SDK - and at this stage all DB tests should be converted to non-DB tests.
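For reference, the DB / non-DB split is selected with a pytest marker today (the `db_test` marker name is from memory, so treat it as an assumption), which is what "converting" a test boils down to:

```bash
# Non-DB tests: no db_test marker, safe to run fully in parallel without databases.
pytest providers/tests -m "not db_test" -n auto
# DB tests: still need the database containers from docker-compose / the CI image.
pytest providers/tests -m "db_test"
```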
We will still need provider tests to run in the CI image and use the image to calculate constraints etc. - for quite some time (maybe forever), but I see a clear path and stages we could take to simplify the "Airflow" and Task SDK ones.
When we add the capability of testcontainers (see #43514), the DB tests might eventually be run in a local virtualenv as well.
I made a few refinements to the initial proposal
> I think Task SDK (@kaxil @ashb ?) there are very little dependencies there and we could easily make a local venv there instead of the whole CI image as there are no specific system or provider dependencies there.
Yes, there should be no deps on any providers, not even the new standard provider, as all of those need to depend on the SDK.
Given the ability of UV to create predictable venvs with the right versions installed, yeah, I agree. Also, if it's relevant, I think every* test in the Task SDK will be stateless and could be parallelized - i.e. they are all non-db style tests.
Yeah, and starting to have it in a freshly created venv without any DB or other deps for the Task SDK is actually even GOOD, because then we will not have even accidental dependencies that we missed.
Now we have three test folders for unit tests: "tests", "providers/tests", "task_sdk/tests". Do we want to include `helm_tests/` in that list to make it 4?

> Also the same breeze testing tests command is used to run "helm" unit tests

Also, system tests are run by the `breeze testing tests` command as of now, which caused some confusion and issues recently. I think that's worth having a bullet point for in your explanation comment.
> and introduce two more testing commands:
Again, should we make system tests a top-level command as well (`breeze testing system-tests`)? There wouldn't be much there now, but I think it's a nice future-proof solution. Right now the system test stuff works in a very messy way, and I think making it a top-level command would add much needed separation.
Otherwise I like the plan!
> Also system tests are run by the breeze testing tests command as of now which caused some confusion and issues recently. I think that's worth having a bullet point for in your explanation comment.
Agree. Will update the plan accordingly.
I updated it. I added something that was long overdue - "dry run system tests". While we will not be able to run all the system tests from Amazon/Google/Astronomer/Teradata/Microsoft, we should be able to have some "canonical", non-externally-bound system tests that we run to "mimic" what Amazon/Google/Astronomer/Teradata/Microsoft will be doing in the future, so that we know at least the "execution framework" works.
I updated it following the discussion on slack - adding the stage where we get rid of all db and only use task sdk in all provider tests (stage 3)
@potiuk Going to attempt stage 1 (Providers Non-DB tests) over this weekend. Would you have any suggestions or thoughts for this? At present I am thinking of just having a similar setup to task_sdk - as mentioned above, it is a good example direction?
> I updated it following the discussion on slack - adding the stage where we get rid of all db and only use task sdk in all provider tests (stage 3)
Yeah, that makes sense. :)
Sorry - only seeing it now (Hackathon). I thought about having a completely separate testing sub-command for each test type:
- `breeze testing core-tests-db`
- `breeze testing core-tests-non-db`
- `breeze testing providers-tests-db`
- `breeze testing providers-tests-non-db`
- `breeze testing providers-task-sdk`
- `breeze testing system`
Then each of the commands will have its own (specific to that command) set of test types it can use - core tests will have all the "airflow sub-tests", providers will be able to use `Providers`, `Providers[google]` or `Providers[-amazon,-google]` as they can today. Task SDK will not have any type. System tests will have a type based on which provider's system tests they are.
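To illustrate (the `--test-types` flag and the type names here are assumptions carried over from the current setup, combined with the proposed commands above):

```bash
# Hypothetical invocations of the proposed per-group commands:
breeze testing providers-tests-non-db --test-types "Providers[google]"
breeze testing providers-tests-db --test-types "Providers[-amazon,-google]"
breeze testing core-tests-non-db --test-types "Core"
```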
Doing it this way has the nice effect that you can easily parallelise them as needed following our parallel framework, but also, when you enter breeze with the `breeze` command, you should be able to run any test or group of tests with just `pytest tests/*` or `pytest providers/tests/*` (both DB and non-DB tests should also work when the DB is available). The test type would only be used when you want to run a predefined "set" of tests from outside.
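For example, inside the breeze shell (illustrative paths, pytest-xdist assumed for `-n`):

```bash
# Run any subset directly with pytest - no test type needed:
pytest tests/models -n auto
pytest providers/tests/google
```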
I actually have a draft version of this in my local branch already, and I wanted to make a PR today - so if you have not started on it, I might want to set it up for review for you :D
No worries :) Super, that's great if you have it already - please go ahead. I have not started yet; I was actually looking over the weekend at backporting and testcontainer things.
Testcontainers is incredibly helpful for reducing the need for numerous Docker files and configurations. It's straightforward to use (we can define some test interfaces to extend or implement for any testcontainer in our tests). I experimented with Kafka and Airflow, and while it's not yet a fully functional setup, I managed to get it to a point where it connects a Kafka Testcontainer and runs a DAG from my local setup. I think this is definitely worth using. I will try running this with GHA to see how it performs :)
cool!
Currently `non-db-tests` execute all test types specified via `--test-types` in a single process, using xdist (`-n <processors>`) parallelism in order to make efficient use of parallelisation of the tests. This means that both "core" and "providers" types of tests are run in a single `pytest` command. For example:

Where PARALLEL_TEST_TYPES is produced by the "selective check" to determine which tests should be run.
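A rough sketch of that single-process run (this is not the original example; the exact command and flag shape are assumptions based on the description above):

```bash
# All selected test types - core and providers alike - go to a single process,
# which parallelises internally with xdist (-n <processors>).
breeze testing non-db-tests --test-types "${PARALLEL_TEST_TYPES}"
```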
However, #42505 splits out providers to a separate directory and we cannot run providers and non-provider tests in the same process any longer, because pytest has a limitation where test module names should not overlap (and tests/conftest.py is different for providers and non-providers, so they cannot be run in the same process, as it will result in collection errors).
One solution to that (probably the best approach) is to introduce a required flag (`--providers`/`--no-providers`) for `non-db-tests`, filter out the core / provider test folders when the respective flag is used, and run two separate jobs in our CI - one for providers and one for core. That would require two separate jobs in `ci.yml` - "Non-db Core tests" and "Non-db Providers tests" - each of them passing the flag down to the `run-unit-tests.yml` composite workflow.
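A sketch of how the two jobs could pass the flag down (only the `--providers`/`--no-providers` flag comes from the proposal above; the variable names and wiring are assumptions):

```bash
# In the composite workflow, each of the two CI jobs selects its group:
if [[ "${TEST_GROUP}" == "providers" ]]; then
  breeze testing non-db-tests --providers --test-types "${PARALLEL_TEST_TYPES}"
else
  breeze testing non-db-tests --no-providers --test-types "${PARALLEL_TEST_TYPES}"
fi
```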