Closed AetherUnbound closed 9 months ago
@AetherUnbound Can I be assigned to this issue?
@AetherUnbound @sarayourfriend Here is a code I tried to implement in catalog/test/conftest/py
.
@pytest.fixture()
def get_test_dag_id():
return ""
@pytest.fixture()
def get_test_pool():
return ""
@pytest.fixture()
def isTaskInstance():
return False
@pytest.fixture()
def isPool():
return False
@pytest.fixture(autouse=True)
def clean_db(get_test_dag_id, get_test_pool, isTaskInstance, isPool):
with create_session() as session:
# synchronize_session='fetch' required here to refresh models
# https://stackoverflow.com/a/51222378 CC BY-SA 4.0
session.query(DagRun).filter(DagRun.dag_id.startswith(get_test_dag_id)).delete(
synchronize_session="fetch"
)
if isTaskInstance:
session.query(TaskInstance).filter(
TaskInstance.dag_id.startswith(get_test_dag_id)
).delete(synchronize_session="fetch")
if isPool:
session.query(Pool).filter(id == get_test_pool).delete()
The goal was to create pytest.fixture on each individual test to override get_test_dag_id
, get_test_pool
, isTaskInstance
, and isPool
global values. I thought clean_db
will re-run again each time another test ran which I'm not sure if it did. It fails when I run just catalog/test
. Am I going on the right direction? How do I make sure clean_db
will trigger every time a new test is being ran?
As far as getting clean_db
to run for each test, what you've got make sense to me. The only thing is if you want it to run after the test, you can yield
at the start of the function. Pytest runs fixture functions before the test, and if they yield, it'll call .next
on the generator after the test is down, sort of like a tear-down step. Not sure if that is the issue though.
I also don't know for sure if it's possible to create local overrides for fixtures the way you're proposing. I think the right approach for your idea (which sounds good to me), is to use the request
fixture to get information from the test's module:
The idea would be to define test_dag_id
, test_pool
, etc, as variables in the module. They can be functions too, if you want. Then retrieve them using something similar to the code in the example in the docs:
test_dag_id = getattr(request.module, "test_dag_id", None)
# probably raise an exception if `test_dag_id` is None?
Also, if clean_db
only needs to run for a subset of tests, like a group of tests that are all in a subdirectory (even if nested), then it should go into a conftest.py
in that directory, so that it doesn't run for tests that don't need it.
def do_execute(self, cursor, statement, parameters, context=None):
> cursor.execute(statement, parameters)
E sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) can't adapt type 'function'
E [SQL: DELETE FROM dag_run WHERE dag_run.dag_id = %(dag_id_1)s]
E [parameters: {'dag_id_1': <function test_dag_id at 0x7f94651632e0>}]
E (Background on this error at: https://sqlalche.me/e/14/f405)
/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/default.py:736: ProgrammingError
I tried using the request
fixture and there is an error when calling getattr
. I have created a pull request using an override method.(link).
Description
We have a few places in our catalog tests with very similar logic for creating the DAG necessary for running some tests. Examples:
https://github.com/WordPress/openverse/blob/9056029e6c354fe2dbf5b0ad49d60648bd380d27/catalog/tests/dags/common/sensors/test_single_run_external_dags_sensor.py#L23-L34
https://github.com/WordPress/openverse/blob/9056029e6c354fe2dbf5b0ad49d60648bd380d27/catalog/tests/dags/data_refresh/test_dag_factory.py#L22-L25
https://github.com/WordPress/openverse/blob/9056029e6c354fe2dbf5b0ad49d60648bd380d27/catalog/tests/dags/common/test_ingestion_server.py#L21-L31
https://github.com/WordPress/openverse/blob/9056029e6c354fe2dbf5b0ad49d60648bd380d27/catalog/tests/dags/common/sensors/test_utils.py#L18-L21
https://github.com/WordPress/openverse/blob/9056029e6c354fe2dbf5b0ad49d60648bd380d27/catalog/tests/dags/providers/test_provider_dag_factory.py#L27-L37
Although there are some slight differences, it would be ideal if we could find a way to combine these into a single fixture that's used within all the affected tests, rather than redefining the fixture for every test.
Additional context
Came up in discussing https://github.com/WordPress/openverse/pull/2209#discussion_r1232622950