Thank you, @millin. I will try to take a look today and get back to you with what I find. I am surprised the `serialized_dag` table is cleared with every call to `db migrate`. I thought it was basically a no-op when the DB is already up to date.
I am also surprised; however, the default of the `reserialize_dags` argument is `True`:
https://github.com/apache/airflow/blob/f0ef69198ec0b7ad0c489cbccf76f6130445fedf/airflow/cli/cli_config.py#L659-L666
I can suggest a quick fix:
--- await run_command("airflow db migrate", env=environ)
+++ await run_command("airflow db migrate --no-reserialize-dags", env=environ)
But I'm not sure it won't cause problems when updating the Airflow version.
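To illustrate why the option defaults to `True`, here is a minimal sketch of how an `argparse` "store false" flag behaves. It is a simplified stand-in and not Airflow's actual CLI code: `build_parser` is a hypothetical helper, and only the flag name `--no-reserialize-dags` and the argument name `reserialize_dags` are taken from the comments above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the `db migrate` argument parser."""
    parser = argparse.ArgumentParser(prog="airflow db migrate")
    # A "store_false" flag: the value is True unless the flag is passed.
    parser.add_argument(
        "--no-reserialize-dags",
        action="store_false",
        dest="reserialize_dags",
        default=True,
        help="Skip DAG re-serialization after migrating.",
    )
    return parser


if __name__ == "__main__":
    parser = build_parser()
    print(parser.parse_args([]).reserialize_dags)
    print(parser.parse_args(["--no-reserialize-dags"]).reserialize_dags)
```

With a flag defined this way, a plain `airflow db migrate` call opts in to re-serialization, which would match the behavior described in this thread.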
Yeah, I also noticed the `reserialize_dags` argument. In addition to not being super comfortable about using an argument which is not documented, I also think the re-serialization is required when we are doing a version upgrade, which is possible with MWAA. Ideally, Airflow shouldn't be doing the re-serialization if the DB is already migrated, and we should probably report this as a bug in Airflow. However, for MWAA, we will probably have to change the code to do the check ourselves and avoid calling `db migrate` if the DB is already initialized.
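A rough sketch of the "check first, migrate only if needed" idea follows. It is not the code from the actual fix. It assumes that `airflow db check-migrations` exits with status 0 when the DB is already at the latest revision and non-zero otherwise (it may wait for its default timeout before failing). `migrate_if_needed` and `run_shell` are hypothetical names, and the command runner is injected so the logic can be exercised without Airflow installed.

```python
import asyncio
from typing import Awaitable, Callable

# Assumption: exits 0 when the DB schema is already up to date.
CHECK_CMD = "airflow db check-migrations"
MIGRATE_CMD = "airflow db migrate"

# A runner takes a shell command and returns its exit code.
Runner = Callable[[str], Awaitable[int]]


async def migrate_if_needed(run: Runner) -> bool:
    """Run the migration only when the check reports pending work.

    Returns True if `db migrate` was executed, False if it was skipped.
    """
    if await run(CHECK_CMD) == 0:
        # DB is up to date: skip migrate, so serialized_dag is left alone.
        return False
    if await run(MIGRATE_CMD) != 0:
        raise RuntimeError("airflow db migrate failed")
    return True


async def run_shell(cmd: str) -> int:
    """Hypothetical runner: execute the command in a shell, return its exit code."""
    proc = await asyncio.create_subprocess_shell(cmd)
    return await proc.wait()
```

In an entrypoint this could be called as `asyncio.run(migrate_if_needed(run_shell))`. On a version upgrade the check would fail, so the migration (and the re-serialization that comes with it) would still run.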
@millin, this should fix the issue you reported: https://github.com/aws/amazon-mwaa-docker-images/pull/125. I am currently out of office, but MWAA developers should pick up the PR and merge it internally. Feel free to ping them if you don't hear any response, or alternatively reach out to AWS support to ensure the team stays on top of this issue, as it is pretty important and might impact multiple customers without them necessarily noticing.
The fix has been deployed to all regions. Customers can trigger an environment update to receive the latest image.
Describe the bug
The `serialized_dag` table is cleared every time a new instance (e.g. a worker) is started.
This results in the following errors:
- `trigger_dag` is calling
- `DAG <dag_id> seems to be missing from DagBag` error in webserver
The error occurs very often because each time a new image is started (e.g. for a new worker), the `airflow db migrate` command is called from `entrypoint`, which in turn runs `reserialize_dags` and clears the whole `serialized_dag` table.
Originally posted by @millin in https://github.com/apache/airflow/issues/40082#issuecomment-2253016337