Open SoftDed opened 8 months ago
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
I've found a workaround for this problem. I can manually create a session:
```python
def print_triggering_dataset_events(triggering_dataset_events=None, **kwargs):
    from airflow.utils.session import create_session

    with create_session() as session:  # <-- manually created session here
        for dataset, dataset_list in triggering_dataset_events.items():
            for dataset_event in dataset_list:
                # <-- and the session passed explicitly here
                print('Task result: ', dataset_event.source_task_instance.xcom_pull(task_ids='test', session=session))
```
Another observation.
I wrote my own xcom backend. If I use S3Hook and fetch data from Connection in the metadata database, everything breaks. However, if I retrieve data from environment variables, such a 'hack' works.
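For context, a custom XCom backend along the lines described would look roughly like the sketch below. The class name, bucket name, and connection id are hypothetical, not from this issue; the point is only to show where the `S3Hook` reads its credentials from a `Connection` stored in the metadata database, which is the step the commenter reports breaking:

```python
# Hypothetical sketch of a custom XCom backend that stores values in S3.
# Names and the bucket are illustrative, not taken from this issue.
import json
import uuid

from airflow.models.xcom import BaseXCom
from airflow.providers.amazon.aws.hooks.s3 import S3Hook


class S3XComBackend(BaseXCom):
    PREFIX = "xcom_s3://"
    BUCKET_NAME = "my-xcom-bucket"  # hypothetical bucket

    @staticmethod
    def serialize_value(value, **kwargs):
        # S3Hook fetches credentials from the 'aws_default' Connection in the
        # metadata database -- the part that reportedly breaks when called
        # from inside triggering_dataset_events without a session.
        hook = S3Hook(aws_conn_id="aws_default")
        key = f"values/{uuid.uuid4()}.json"
        hook.load_string(json.dumps(value), key=key,
                         bucket_name=S3XComBackend.BUCKET_NAME)
        return BaseXCom.serialize_value(S3XComBackend.PREFIX + key)

    @staticmethod
    def deserialize_value(result):
        value = BaseXCom.deserialize_value(result)
        if isinstance(value, str) and value.startswith(S3XComBackend.PREFIX):
            hook = S3Hook(aws_conn_id="aws_default")
            key = value[len(S3XComBackend.PREFIX):]
            return json.loads(hook.read_key(key,
                                            bucket_name=S3XComBackend.BUCKET_NAME))
        return value
```

Pulling the credentials from environment variables instead (e.g. `AIRFLOW_CONN_AWS_DEFAULT`) bypasses the metadata-database lookup, which would explain why that "hack" works.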
Hi @utkarsharma2, I would like to give this a try. Could you please assign me?
Hi @SoftDed,
I think the problem happens after `xcom_pull` is called. If one collects the `source_task_instance`s first and then calls `xcom_pull`, it works without any issue:
```python
@task
def print_triggering_dataset_events(triggering_dataset_events=None):
    task_instances = []
    for dataset, dataset_list in triggering_dataset_events.items():
        for dataset_event in dataset_list:
            task_instances.append(dataset_event.source_task_instance)
    for task_instance in task_instances:
        print('Task instance: ', task_instance.xcom_pull(task_ids='test'))
```
I am not sure how to fix this but wanted to share this observation as well.
Apache Airflow version
2.8.0
If "Other Airflow 2 version" selected, which one?
No response
What happened?
When using two or more Datasets as triggers, an error occurs while accessing the `source_task_instance` attribute of a `DatasetEvent`.

What you think should happen instead?

It should be possible to access all fields of every `DatasetEvent`.

How to reproduce
Create two test DAGs (producer and consumer) and link them with two DataSets. Code:
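The original reproduction code was not included; a minimal sketch of such a producer/consumer pair might look like the following (the Dataset URIs and DAG ids are illustrative assumptions, while the task id `test` matches the snippets above):

```python
# Hypothetical reproduction sketch: a producer DAG emitting two Datasets
# and a consumer DAG scheduled on both. URIs and ids are illustrative.
import pendulum

from airflow.datasets import Dataset
from airflow.decorators import dag, task

ds1 = Dataset("s3://bucket/file1")  # hypothetical URI
ds2 = Dataset("s3://bucket/file2")  # hypothetical URI


@dag(schedule=None, start_date=pendulum.datetime(2024, 1, 1), catchup=False)
def producer():
    @task(outlets=[ds1, ds2])  # one task updates both Datasets
    def test():
        return "some result"

    test()


@dag(schedule=[ds1, ds2], start_date=pendulum.datetime(2024, 1, 1),
     catchup=False)
def consumer():
    @task
    def print_triggering_dataset_events(triggering_dataset_events=None):
        for dataset, dataset_list in triggering_dataset_events.items():
            for dataset_event in dataset_list:
                # Accessing source_task_instance here is what triggers
                # the reported error when two or more Datasets are used.
                print('Task result: ',
                      dataset_event.source_task_instance.xcom_pull(task_ids='test'))

    print_triggering_dataset_events()


producer()
consumer()
```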
Operating System
Debian GNU/Linux 12 (bookworm)
Versions of Apache Airflow Providers
Deployment
Docker-Compose
Deployment details
No response
Anything else?
Error in log:
In the UI
With two Datasets - error
With one Dataset - OK
Are you willing to submit PR?
Code of Conduct