Closed arpit-maheshwari1 closed 1 month ago
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.
Looking into this one. It seems the conversion of received messages to JSON-serializable dicts is missing when deferrable=True. Let me check and update.
Apache Airflow Provider(s)
google
Versions of Apache Airflow Providers
No response
Apache Airflow version
2.9.1
Operating System
Google Cloud Composer 2 (managed Airflow environment on Google Cloud)
Deployment
Google Cloud Composer
Deployment details
No response
What happened
I'm using the PubSubPullSensor in Apache Airflow with deferrable=True on Google Cloud Composer 2. When the sensor is set to deferrable=True, the messages stored in XCom have a different format compared to when deferrable=False.
When deferrable=True: The messages are stored in XCom as strings that resemble serialized protobuf format:
When deferrable=False: The messages are stored in XCom as standard Python dictionaries:
This difference in formats causes issues when processing the messages downstream, as the structure is inconsistent and the protobuf-like strings require additional parsing.
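Until the two modes agree, a downstream task can at least detect which shape it received. The following is a minimal sketch, assuming the pulled XCom value is a list whose entries are either dicts or strings. The helper names and the sample values are hypothetical and are not taken from the provider or from this report.

```python
import json
from typing import Any, List


def is_json_serializable_dict(item: Any) -> bool:
    """Return True if item is a dict that json.dumps can encode."""
    if not isinstance(item, dict):
        return False
    try:
        json.dumps(item)
    except (TypeError, ValueError):
        return False
    return True


def find_inconsistent_messages(messages: List[Any]) -> List[int]:
    """Return the indexes of entries that are not JSON-serializable dicts."""
    return [
        index
        for index, message in enumerate(messages)
        if not is_json_serializable_dict(message)
    ]


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    dict_message = {"ack_id": "abc", "message": {"data": "aGVsbG8="}}
    string_message = 'ack_id: "abc"\nmessage {\n  data: "hello"\n}'
    print(find_inconsistent_messages([dict_message, string_message]))  # prints [1]
```

A check like this only flags the mismatch. It does not parse the protobuf-like strings, so it is a guard for downstream tasks rather than a fix for the sensor.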
What you think should happen instead
The expected behavior is that the messages should be stored in XCom in a consistent format, regardless of whether deferrable=True or deferrable=False is set, ideally as a standard Python dictionary.
How to reproduce
Steps to Reproduce:
Run the PubSubPullSensor with deferrable=True and with deferrable=False.
You can reproduce this issue using the following DAG:
Note:
Check the XCom of the "wait_for_message" task to see how the message is stored.
Replace project_id and topic_name in the code with your actual Google Cloud project ID and Pub/Sub topic name.
Anything else
No response
Are you willing to submit PR?
Code of Conduct