apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

GCSToGCSOperator source_objects failing to parse xcom #42391

Closed blaklaybul closed 1 month ago

blaklaybul commented 1 month ago

Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

10.15.0

Apache Airflow version

2.7.3

Operating System

debian

Deployment

Astronomer

Deployment details

Using astro runtime 9.15.0

What happened

The GCSToGCSOperator raises a TypeError in its constructor when a task's output (an XComArg) is passed to the source_objects field.

  File "/usr/local/lib/python3.11/site-packages/airflow/providers/google/cloud/transfers/gcs_to_gcs.py", line 211, in __init__
    if source_objects and any(WILDCARD in obj for obj in source_objects):
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/airflow/models/xcom_arg.py", line 268, in __iter__
    raise TypeError("'XComArg' object is not iterable")
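The failure mode in this traceback can be reproduced without Airflow. The sketch below is only an illustration: `PlaceholderArg` is a hypothetical stand-in for `XComArg` (not Airflow's implementation), `has_wildcard` is a hypothetical helper that mirrors the shape of the check on line 211, and `WILDCARD = "*"` is an assumed value.

```python
WILDCARD = "*"  # assumed value of the provider's constant


class PlaceholderArg:
    """Hypothetical stand-in for XComArg: a reference to a value that only exists at run time."""

    def __iter__(self):
        # Mirrors the behaviour shown in the traceback: iteration is refused outright.
        raise TypeError("'PlaceholderArg' object is not iterable")


def has_wildcard(source_objects):
    # Same shape as the check in the traceback: it iterates the argument immediately.
    return bool(source_objects) and any(WILDCARD in obj for obj in source_objects)


# A plain list can be inspected at construction time.
print(has_wildcard(["a", "b*"]))

# A run-time placeholder cannot, so the check fails before any task has run.
try:
    has_wildcard(PlaceholderArg())
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Any check that iterates the argument while the DAG file is being parsed will hit this, because the list of files does not exist until the upstream task has run.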

What you think should happen instead

This operator should accept an XComArg for source_objects. We pass task output to the GCSToBigQueryOperator in the same way without any issue.
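One way an operator could support this is to store the argument untouched in the constructor and inspect it only when the task executes. The following is a minimal sketch of that idea under stated assumptions, not the GCSToGCSOperator implementation: `RuntimeValue`, its `resolve` method, and `DeferredCheckCopy` are hypothetical names, and `WILDCARD = "*"` is assumed.

```python
WILDCARD = "*"  # assumed value of the provider's constant


class RuntimeValue:
    """Hypothetical placeholder whose real value is produced only at run time."""

    def __init__(self, producer):
        self._producer = producer

    def __iter__(self):
        raise TypeError("'RuntimeValue' object is not iterable")

    def resolve(self):
        # Stand-in for the framework replacing the placeholder with the real value.
        return self._producer()


class DeferredCheckCopy:
    """Hypothetical operator-like class that postpones validation to execute()."""

    def __init__(self, source_objects):
        # Store the argument as-is; do not iterate it here.
        self.source_objects = source_objects

    def execute(self):
        objects = self.source_objects
        if isinstance(objects, RuntimeValue):
            objects = objects.resolve()
        # The wildcard check is safe now, because objects is a real list.
        wildcards = [obj for obj in objects if WILDCARD in obj]
        return {"objects": list(objects), "wildcards": wildcards}


op = DeferredCheckCopy(RuntimeValue(lambda: ["a", "b", "c*"]))  # constructing does not raise
result = op.execute()
print(result)
```

The design choice is simply to move every check that needs the concrete value out of `__init__`, so that construction never depends on data that is only available at run time.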

How to reproduce

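from airflow.decorators import task
from airflow.providers.google.cloud.transfers.gcs_to_gcs import GCSToGCSOperator

# Assumes this runs in a DAG file where a DAG object named `dag` is already defined.

@task
def return_files():
    files = ['a', 'b', 'c', 'd']
    return files

files = return_files()  # an XComArg at parse time, not a list

# The TypeError is raised here, in __init__, before any task runs.
extract_files = GCSToGCSOperator(
    task_id="fetch_data",
    source_bucket="source_bucket_name",
    source_objects=files,
    destination_bucket="dest_bucket_name",
    destination_object="foo",
    dag=dag,
)

files >> extract_files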

Anything else

No response

boring-cyborg[bot] commented 1 month ago

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.