Apache Airflow Provider(s)
common-sql
Versions of Apache Airflow Providers
Latest
Apache Airflow version
Latest
Operating System
Linux
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
What happened
When Airflow's SQLExecuteQueryOperator is used with Trino and do_xcom_push is set to False, the operator skips its result processing, because that step is tied to XComs. This conflicts with Trino's client protocol, which requires the client to retrieve the result set before Trino can continue processing. Because Airflow never fetches the results, Trino cancels the query and returns a USER_CANCELED error, even if the underlying data modification (such as an INSERT) was partially successful. Airflow nevertheless reports the task as successful.
Trino client protocol [https://trino.io/docs/current/client/client-protocol.html#client-protocol]
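To make the failure mode concrete, here is a toy model of the behavior described in this report. It is only a sketch: `ToyTrinoQuery`, `QueryState`, and `run` are hypothetical names invented for illustration, not part of the Trino client or of Airflow. The model assumes a query only reaches a finished state once the client has read every result page, and that closing it earlier counts as a user cancellation.

```python
from enum import Enum


class QueryState(Enum):
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    USER_CANCELED = "USER_CANCELED"


class ToyTrinoQuery:
    """Toy stand-in for a server-side query; NOT the real Trino client API."""

    def __init__(self, pages):
        self._pages = list(pages)  # result pages the client has not read yet
        # A query with nothing to deliver is treated as already finished.
        self.state = QueryState.RUNNING if self._pages else QueryState.FINISHED

    def next_page(self):
        """Return the next page, or None once there is nothing left to read."""
        if self.state is not QueryState.RUNNING:
            return None
        page = self._pages.pop(0)
        if not self._pages:
            # The last page was delivered, so the query can complete.
            self.state = QueryState.FINISHED
        return page

    def close(self):
        """Abandon the query; unread results turn into a cancellation."""
        if self.state is QueryState.RUNNING:
            self.state = QueryState.USER_CANCELED


def run(pages, fetch_results):
    """Mimic a client that either drains the results or skips them."""
    query = ToyTrinoQuery(pages)
    if fetch_results:
        while query.next_page() is not None:
            pass
    query.close()
    return query.state


drained = run([[("rows", 1)]], fetch_results=True)
skipped = run([[("rows", 1)]], fetch_results=False)
```

In this model, `drained` ends up as `QueryState.FINISHED` and `skipped` as `QueryState.USER_CANCELED`, which mirrors the difference between fetching the results and skipping them when do_xcom_push is False.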
This issue appears to be specific to Trino and does not affect other databases used with the same operator. The proposed solution is to create a dedicated TrinoOperator that directly handles Trino's protocol. Instead of relying on XComs, TrinoOperator would fetch all results by default and let developers decide how to use them. I plan to open a PR for this soon.
Feedback and suggestions on the proposed TrinoOperator solution (or an alternative one) would be greatly appreciated.
What you think should happen instead
The proposed solution is to create a dedicated TrinoOperator that directly handles Trino's protocol.
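The core of that idea can be sketched independently of Airflow: always consume the result set after executing a statement, whether or not anyone uses the rows. The helper below is a hypothetical illustration, not the code of the planned PR. `execute_and_drain` is an invented name, and the sketch uses the standard-library `sqlite3` driver only so that it runs anywhere. It assumes the driver follows the Python DB-API 2.0 convention that `cursor.description` is `None` when a statement produces no result set.

```python
import sqlite3


def execute_and_drain(cursor, sql, parameters=()):
    """Run one statement and always consume whatever result set it produces.

    Hypothetical helper: a Trino-specific operator could apply the same
    pattern so the query is fully read regardless of do_xcom_push.
    """
    cursor.execute(sql, parameters)
    if cursor.description is None:
        # DB-API 2.0: no result set to read for this statement.
        return []
    # Reading every row is what lets the server consider the query complete.
    return cursor.fetchall()


# Demonstration against an in-memory SQLite database (stand-in for Trino).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
execute_and_drain(cur, "CREATE TABLE events (id INTEGER)")
insert_result = execute_and_drain(cur, "INSERT INTO events VALUES (?)", (1,))
select_result = execute_and_drain(cur, "SELECT id FROM events")
conn.close()
```

With SQLite the INSERT yields no result set, so the drain step is a no-op there. With Trino, per the report above, the INSERT returns a confirmation response that has to be read, which is exactly the step that gets skipped today.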
How to reproduce
A test case confirms this behavior: an INSERT query succeeded in writing data to the Trino table, but the Trino query itself was canceled because Airflow did not retrieve the confirmation response. This resulted in a USER_CANCELED error on the Trino side, even though the data had been inserted. Airflow still marked the task as successful.
Anything else
No response
Are you willing to submit PR?
Code of Conduct