apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0
Add a check in BigQueryInsertJobOperator to verify the Job state before marking it as success in Airflow #40839

Open kandharvishnu opened 1 month ago

kandharvishnu commented 1 month ago

Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

apache-airflow-providers-google==10.12.0

Apache Airflow version

2.7.3

Operating System

debian

Deployment

Astronomer

Deployment details

No response

What happened

When the BigQueryInsertJobOperator submits a job to BigQuery, the Airflow TaskInstance is occasionally marked as success before the job has actually completed in BigQuery.

When the execute method of a TaskInstance’s Operator completes without raising an exception, the TaskInstance is marked as success. Since these TaskInstances are being marked as success, the BigQueryInsertJobOperator’s execute method must be completing without raising. That suggests QueryJob.result is returning before the job has finished, which allows the execute method to complete.

What you think should happen instead

An additional check should be added to BigQueryInsertJobOperator to verify that the job state is DONE in BigQuery, so that Airflow marks the TaskInstance as success only when the job has actually completed in BigQuery.
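
A minimal sketch of what such a check might look like, not the provider's actual code. The helper name `ensure_job_done`, its `max_checks` and `poll_interval` parameters, and the `JobLike` protocol are hypothetical. It relies only on `reload()`, `state`, and `error_result`, which google-cloud-bigquery job objects expose. `RuntimeError` stands in for whatever exception the operator would raise.

```python
import time
from typing import Any, Optional, Protocol


class JobLike(Protocol):
    """Subset of the BigQuery job interface this sketch relies on (hypothetical name)."""

    state: Optional[str]
    error_result: Optional[dict]

    def reload(self) -> Any: ...


def ensure_job_done(job: JobLike, max_checks: int = 5, poll_interval: float = 2.0) -> None:
    """Raise unless BigQuery reports the job as DONE with no error.

    Hypothetical helper: intended to run after job.result() has returned
    and before the task is allowed to finish successfully.
    """
    for attempt in range(max_checks):
        job.reload()  # refresh the job's state from the BigQuery API
        if job.state == "DONE":
            break
        if attempt < max_checks - 1:
            time.sleep(poll_interval)  # give the job a moment before re-checking
    else:
        # The loop ended without ever seeing DONE, so fail the task.
        raise RuntimeError(
            f"BigQuery job not DONE after {max_checks} checks; last state: {job.state}"
        )

    # DONE only means the job stopped running; it may still have failed.
    if job.error_result:
        raise RuntimeError(f"BigQuery job finished with an error: {job.error_result}")
```

Inside the operator's execute method, a call such as `ensure_job_done(job)` placed after `job.result()` would turn a premature return into a task failure (or retry) rather than a false success.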

How to reproduce

I haven't been able to replicate the issue. It happens infrequently.

Anything else

There might be a bug in google-cloud-bigquery==3.13.0 that causes the QueryJob.result method to return before the job is actually DONE.

boring-cyborg[bot] commented 1 month ago

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

RNHTTR commented 1 month ago

Worth noting that even though this is on an older version of the provider, there's no check in main to see whether the job completed in BigQuery. I think this is probably a bug in google-cloud-bigquery==3.13.0, but it still might make sense to check whether the job is complete to avoid this situation completely.