apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Optimize startup time for parallel tests #40388

Closed potiuk closed 5 days ago

potiuk commented 6 days ago

When parallel tests start on CI, each of them runs its own docker compose, and on a clean CI machine this means that every parallel test pulls the images needed to run the tests. As a result, the backend images are pulled in parallel by all the starting parallel runs.

This PR optimizes this step - before running the tests in parallel we run docker compose pull once with the same compose files as the tests - so the necessary images are pulled only once.
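A minimal sketch of the idea, not the actual change in this PR: pull once with the same compose files, then start the parallel runs. The function names, the `airflow` service name, the way the test type is passed, and the injectable `run` callable are all assumptions made for illustration; only `docker compose -f ... pull` and `docker compose run --rm` are real CLI invocations.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def build_compose_command(compose_files: Sequence[str], *args: str) -> list[str]:
    """Build a `docker compose` command line that uses the given compose files."""
    command = ["docker", "compose"]
    for compose_file in compose_files:
        command += ["-f", compose_file]
    return command + list(args)


def run_tests_in_parallel(
    compose_files: Sequence[str],
    test_types: Sequence[str],
    run: Callable[[list[str]], object] = lambda cmd: subprocess.run(cmd, check=True),
    parallelism: int = 8,
) -> None:
    # Pull once, sequentially, before any parallel run starts.
    run(build_compose_command(compose_files, "pull"))

    def run_one(test_type: str) -> object:
        # Hypothetical: service name "airflow" and passing the test type
        # as a command argument are assumptions for this sketch.
        return run(build_compose_command(compose_files, "run", "--rm", "airflow", test_type))

    # The parallel runs use the same compose files, so the images
    # they need should already be in the local image cache.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        list(pool.map(run_one, test_types))
```

Because the pull uses exactly the same compose files as the test runs, it fetches the same set of images the tests would otherwise each pull on their own.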

It should save a few seconds and a lot of unnecessary traffic for CI tests, where the same image is pulled multiple times - especially on our self-hosted runners, where we run 8 docker compose instances in parallel.



potiuk commented 5 days ago

There was some strange output on the "lowest direct" tests in this one, so I will have to take a closer look before we merge it :).

potiuk commented 5 days ago

> There was some strange output on the "lowest direct" tests in this one, so I will have to take a closer look before we merge it :).

Yeah. I had to disable caching of the env variables - because the cached values were always computed without the TEST_TYPE env variable.
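One way this kind of problem can arise, shown as a hedged sketch rather than the actual Airflow code: a cached value that is computed before TEST_TYPE is set stays stale afterwards. The class `EnvForComposeSketch` and its members are hypothetical names; only `functools.cached_property` is a real standard-library API.

```python
from functools import cached_property
from typing import MutableMapping


class EnvForComposeSketch:
    """Hypothetical holder of the env variables passed to docker compose."""

    def __init__(self, environ: MutableMapping[str, str]) -> None:
        self._environ = environ

    def _build(self) -> dict[str, str]:
        # Read TEST_TYPE at the moment of the call.
        return {"TEST_TYPE": self._environ.get("TEST_TYPE", "")}

    @cached_property
    def cached_env(self) -> dict[str, str]:
        # Computed on first access only; later changes are not seen.
        return self._build()

    def fresh_env(self) -> dict[str, str]:
        # Recomputed on every call, so it reflects the current TEST_TYPE.
        return self._build()


environ: dict[str, str] = {}
params = EnvForComposeSketch(environ)
first = params.cached_env          # accessed before TEST_TYPE is set
environ["TEST_TYPE"] = "Core"
stale = params.cached_env          # still the value from the first access
fresh = params.fresh_env()         # sees TEST_TYPE=Core
```

Dropping the cache trades a small amount of repeated work for values that always match the current environment.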