apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Add missing test modules in our unit test suite #35442

Open potiuk opened 8 months ago

potiuk commented 8 months ago

See test_providers_modules_should_have_tests in tests/always/test_project_structure.py but those tests are missing currently:

  • Provider amazon (codecov)
  • Provider apache.cassandra (codecov)
  • Provider apache.drill (codecov)
  • Provider apache.druid (codecov)
  • Provider apache.hdfs (codecov)
  • Provider apache.hive (codecov)
  • Provider apache.kafka (codecov)
  • Provider celery (codecov)
  • Provider cncf.kubernetes (codecov)
  • Provider common.io (codecov)
  • Provider databricks (codecov)
  • Provider dbt.cloud (codecov)
  • Provider docker (codecov)
  • Provider elasticsearch (codecov)
  • Provider google (codecov)
  • Provider microsoft.azure (codecov)
  • Provider mongo (codecov)
  • Provider openlineage (codecov)
  • Provider presto (codecov)
  • Provider redis (codecov)
  • Provider slack (codecov)
  • Provider snowflake (codecov)
  • Provider trino (codecov)


potiuk commented 8 months ago

cc: @eladkal

potiuk commented 8 months ago

@eladkal -> first test removed from the list with #35457 -> I hope you can bring more people to contribute here :)

potiuk commented 8 months ago

Also related to #35127

potiuk commented 8 months ago

cc: @ephraimbuddy -> this might guide people on where to add tests to cover whole modules that are not "targeted" by tests (they might still be exercised incidentally by tests for other modules, but they have no dedicated test module). It might help people decide where to add tests. It is also a nice list of "todos".

Taragolis commented 8 months ago

tests/providers/slack/notifications/test_slack_notifier.py

I am thinking about how to exclude the case where the original module airflow.providers.slack.notifications.slack_notifier is deprecated

Taragolis commented 8 months ago

And just to add here a previous investigation into missing tests: https://github.com/apache/airflow/pull/28459#discussion_r1054708003. Not all of them are actually missing:

  • some tests were moved into tests.integrations
  • Amazon links are tested in one go in the same module
  • Some Amazon tests are split across different test modules, for example: test_s3_bucket.py + test_s3_bucket_tagging.py + test_s3_file_transform.py + test_s3_list.py + test_s3_list_prefixes.py + test_s3_object.py, while the module itself is named s3.py (merged version)

Taragolis commented 8 months ago

And codecov statistics might also help to find untested parts: https://app.codecov.io/gh/apache/airflow/tree/main/airflow%2Fproviders

eladkal commented 8 months ago

tests/providers/slack/notifications/test_slack_notifier.py

I am thinking about how to exclude the case where the original module airflow.providers.slack.notifications.slack_notifier is deprecated

In similar cases we used to keep a list of deprecated classes and remove them from the output.
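The exclusion-list approach mentioned here could look roughly like the sketch below. This is only an illustration, not the code in tests/always/test_project_structure.py: the names DEPRECATED_MODULES and modules_needing_tests are hypothetical, and the second module name in the example is there purely for illustration.

```python
# Hypothetical exclusion list: modules known to be deprecated, which
# should not be reported as "missing tests". Names are assumptions.
DEPRECATED_MODULES = frozenset(
    {"airflow.providers.slack.notifications.slack_notifier"}
)


def modules_needing_tests(modules, deprecated=DEPRECATED_MODULES):
    """Return the modules that should still be reported if they lack tests."""
    return sorted(m for m in modules if m not in deprecated)


candidates = [
    "airflow.providers.slack.notifications.slack_notifier",
    "airflow.providers.slack.notifications.slack",  # illustrative entry
]
print(modules_needing_tests(candidates))
```

A hand-maintained list like this is simple but has to be updated whenever a module is deprecated or removed.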

potiuk commented 8 months ago

I am thinking about how to exclude the case where the original module airflow.providers.slack.notifications.slack_notifier is deprecated

Absolutely - I think there are many false negatives there, so removing them by finding an automated way and improving the test case is an absolutely legitimate way of cleaning up some of those. PRs are most welcome :). I just revived the test really quickly - also in order to not let more new ones creep in - but I think we should definitely remove some of those from the list by making the test "smarter".

And codecov statistics might also help to find untested parts: https://app.codecov.io/gh/apache/airflow/tree/main/airflow%2Fproviders

Oh absolutely - this is why I added my comment https://github.com/apache/airflow/issues/35442#issuecomment-1793747405 and updated https://github.com/apache/airflow/issues/35127 with a comment that these two are connected. I see them as two sides of the same coin - eventually they should converge: the missing coverage should show a list quite close to the one generated here. Currently they are far apart, for the reasons you explained, but also because some modules' code is tested by other tests. But this is precisely what I think should guide us in making the test smarter - eventually both should show the same result.

And just to add here a previous investigation into missing tests: https://github.com/apache/airflow/pull/28459#discussion_r1054708003. Not all of them are actually missing:

  • some tests were moved into tests.integrations
  • Amazon links are tested in one go in the same module
  • Some Amazon tests are split across different test modules, for example: test_s3_bucket.py + test_s3_bucket_tagging.py + test_s3_file_transform.py + test_s3_list.py + test_s3_list_prefixes.py + test_s3_object.py, while the module itself is named s3.py (merged version)

Yeah - why don't we automate that in the test then? I think some of them can be easily automatically detected.
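One case that seems detectable automatically is the split-module pattern from the s3.py example: treat a module as covered when either test_<name>.py or any test_<name>_*.py exists next to the expected location. The sketch below assumes exactly that heuristic; the function name is_covered is hypothetical and this is not the logic currently in the project-structure test.

```python
import re


def is_covered(module_file: str, test_files: set[str]) -> bool:
    """Return True if a dedicated or split test module exists for module_file.

    module_file is a bare file name such as "s3.py"; test_files holds the
    bare file names found in the matching tests directory.
    """
    stem = module_file.removesuffix(".py")
    # Accept test_<stem>.py as well as test_<stem>_<anything>.py
    pattern = re.compile(rf"test_{re.escape(stem)}(_\w+)?\.py")
    return any(pattern.fullmatch(name) for name in test_files)


s3_tests = {"test_s3_bucket.py", "test_s3_list.py", "test_s3_object.py"}
print(is_covered("s3.py", s3_tests))
print(is_covered("emr.py", s3_tests))
```

The heuristic can over-match: a file such as test_emr_containers.py would count as coverage for emr.py even if it really targets a separate emr_containers.py module, so a real check would probably need to exclude stems that belong to other existing modules.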

Taragolis commented 8 months ago

If you don't mind I've split this list by providers

potiuk commented 8 months ago

If you don't mind I've split this list by providers

Not at all! Very helpful, I think :)

potiuk commented 8 months ago

Super useful with codecov links BTW! Good idea.

mateuslatrova commented 3 months ago

Hi! I found that the classes present in the following modules are already covered with tests, but they are unchecked in the issue's first comment:

tests/providers/amazon/aws/operators/test_emr.py
tests/providers/amazon/aws/operators/test_sagemaker.py
tests/providers/amazon/aws/sensors/test_emr.py
tests/providers/amazon/aws/triggers/test_athena.py
tests/providers/amazon/aws/triggers/test_batch.py
tests/providers/amazon/aws/triggers/test_emr.py
tests/providers/amazon/aws/triggers/test_glue_crawler.py
tests/providers/amazon/aws/triggers/test_lambda_function.py
tests/providers/amazon/aws/triggers/test_rds.py
tests/providers/amazon/aws/triggers/test_redshift_cluster.py
tests/providers/amazon/aws/utils/test_rds.py
tests/providers/amazon/aws/utils/test_sagemaker.py
tests/providers/amazon/aws/utils/test_sqs.py
tests/providers/amazon/aws/utils/test_tags.py
tests/providers/amazon/aws/waiters/test_base_waiter.py
tests/providers/amazon/aws/triggers/test_step_function.py

If possible, it would be nice if someone could check them off, so that anyone looking to contribute can find the missing tests more easily. Thanks!

CC: @potiuk @Taragolis

Taragolis commented 2 months ago

@mateuslatrova Some test modules from your list do not exist or are named differently

Exists

I've marked this as completed

Not Exists

mateuslatrova commented 2 months ago

@mateuslatrova Some test modules from your list do not exist or are named differently

Exists

I've marked this as completed

Not Exists

Hi @Taragolis! Thanks for your help.

I guess I was not very clear, but the code corresponding to each of those non-existing files is covered in other test files, and that is why I put them on the list.

For example, one test file from the above list that does not exist is tests/providers/amazon/aws/operators/test_emr.py. This test file would cover the code present in airflow/providers/amazon/aws/operators/emr.py, right? But this code is being covered in other test files:

So the tests/providers/amazon/aws/operators/test_emr.py should not be in that list.

Taragolis commented 2 months ago

There is a pretty simple rule here: if a provider module is placed in airflow/providers/amazon/aws/operators/emr.py, then its tests should be placed in tests/providers/amazon/aws/operators/test_emr.py.

The original check was broken for a while (1-2 years or so), so we have about 100 modules that do not follow this rule.
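
The naming rule above can be written down as a small path mapping. This is a minimal sketch under the assumption that the rule is exactly "replace the leading airflow with tests and prefix the file name with test_"; the function name expected_test_path is hypothetical and is not taken from tests/always/test_project_structure.py.

```python
from pathlib import PurePosixPath


def expected_test_path(module_path: str) -> str:
    """Map a provider module path to the test module path the rule expects."""
    path = PurePosixPath(module_path)
    parts = path.parts
    if parts[:2] != ("airflow", "providers"):
        raise ValueError(f"not a provider module: {module_path}")
    # Swap the leading "airflow" for "tests" and prefix the file name
    # with "test_", keeping every directory in between unchanged.
    return str(PurePosixPath("tests", *parts[1:-1], f"test_{path.name}"))


print(expected_test_path("airflow/providers/amazon/aws/operators/emr.py"))
# -> tests/providers/amazon/aws/operators/test_emr.py
```

Under this rule, a module whose code is covered only by differently named test files would still be reported as missing a test module.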