google / orchestra

Advertising Data Lakes and Workflow Automation
Apache License 2.0

Google Analytics storage_name_object not templated #19

Closed jvschoen closed 4 years ago

jvschoen commented 4 years ago

When using the new google_analytics operators, the documentation says that the storage_name_object parameter is templated. However, when I pass the following:

import_audience = GoogleAnalyticsDataImportUploadOperator(
    task_id='import_audience_list_{}'.format(destination_info['account_name']),
    storage_bucket=walden_config['result_bucket_name'],
    storage_name_object="audience-{{ds_nodash}}.csv",
    account_id=destination_info['account_id'],
    web_property_id=destination_info['web_property_id'],
    custom_data_source_id=destination_info['custom_data_source_id'],
    mime_type='application/octet-stream',
    api_version='v3',
    api_name='analytics',
    gcp_conn_id='google_cloud_default',
    data_import_filename='audience-{{ds_nodash}}.csv',
    dag=dag
)

The dag fails with the following error:

<HttpError 404 when requesting https://storage.googleapis.com/storage/v1/b/<my_bucket>/o/audience-%7B%7Bds_nodash%7D%7D.csv?alt=media returned "Not Found">

I've removed my bucket name here, but it is correct in the url

It looks like it's not recognizing the templated {{ds_nodash}} here.
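As a quick check, the object name in the failing URL can be percent-decoded to confirm that the literal, unrendered template string was sent to Cloud Storage. This is a minimal sketch using only the Python standard library; the encoded string is copied from the error above.

```python
from urllib.parse import unquote

# Object name as it appears in the failing request URL.
encoded = "audience-%7B%7Bds_nodash%7D%7D.csv"

# %7B and %7D are the percent-encoded forms of "{" and "}".
decoded = unquote(encoded)
print(decoded)
```

The decoded name still contains the `{{ds_nodash}}` placeholder rather than a date, which is consistent with the field never being rendered.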

We are running v1.10.2-composer

jvschoen commented 4 years ago

I added template_fields = ['storage_name_object'] to the GoogleMarketingPlatformBaseOperator, which resolves the issue.

ceoloide commented 4 years ago

template_fields = ['storage_name_object'] is missing from GoogleAnalyticsDataImportUploadOperator and GoogleAnalyticsModifyFileHeadersDataImportOperator, despite storage_name_object being documented as templated.

I will add the missing template_fields lists, and also make storage_bucket and data_import_filename templated.
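For context, here is a rough sketch of why a missing template_fields entry leaves the string untouched. It is a simplified stand-in, not Airflow or orchestra code: FakeBaseOperator, render, and the two Upload classes are hypothetical names, and the regex substitution only imitates Jinja for simple {{ name }} placeholders. The one behaviour it mirrors from Airflow is that only attributes listed in template_fields are rendered.

```python
import re


def render(text, context):
    """Tiny substitute for Jinja: replaces {{ name }} with context[name]."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda m: str(context[m.group(1)]), text)


class FakeBaseOperator:
    """Hypothetical, simplified stand-in for Airflow's BaseOperator."""
    template_fields = ()

    def render_templates(self, context):
        # Only attributes named in template_fields are rendered.
        for name in self.template_fields:
            value = getattr(self, name)
            if isinstance(value, str):
                setattr(self, name, render(value, context))


class UploadWithoutTemplateFields(FakeBaseOperator):
    def __init__(self, storage_name_object):
        self.storage_name_object = storage_name_object


class UploadWithTemplateFields(UploadWithoutTemplateFields):
    # Declaring the field is what opts it in to rendering.
    template_fields = ['storage_name_object']


context = {"ds_nodash": "20200101"}  # example execution date, made up here
broken = UploadWithoutTemplateFields("audience-{{ds_nodash}}.csv")
fixed = UploadWithTemplateFields("audience-{{ds_nodash}}.csv")
broken.render_templates(context)
fixed.render_templates(context)

print(broken.storage_name_object)
print(fixed.storage_name_object)
```

In this sketch the first operator keeps the raw placeholder, matching the 404 in the report, while the second resolves it to a dated file name.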

jvschoen commented 4 years ago

I also noticed that my files are being uploaded with "Unknown Filename". I went through the GoogleAnalyticsDataImportUploadOperator class and saw that the data_import_filename parameter is never used during the upload_file stage.

ceoloide commented 4 years ago

Fixed in 2.1.1