Velir / dbt-ga4

dbt Package for modeling raw data exported by Google Analytics 4. BigQuery support, only.
MIT License
289 stars 128 forks source link

Field name item_params does not exist in STRUCT<item_id STRING, item_name STRING, item_brand STRING, ...> #289

Closed erikverheij closed 1 week ago

erikverheij commented 7 months ago

Hello! I'm running into the following error:

Database Error in model base_ga4__events (models/staging/base/base_ga4__events.sql)
Field name item_params does not exist in STRUCT<item_id STRING, item_name STRING, item_brand STRING, ...>; Did you mean item_brand? at [157:30]
compiled Code at target/run/ga4/models/staging/base/base_ga4__events.sql

Do you have any idea what could be the issue? I've removed all tables from my dataset and tried to rebuild everything from scratch but still run into this error.

This is my project config:

name: bigquery_my-company_analytics
version: 0.0.1
config-version: 2
profile: bigquery_my-company_analytics

vars:
  static_incremental_days: 3
  ga4:
    project: my-company-analytics
    property_ids: [1234567, 2345678]
    dataset: analytics_dbt
    start_date: "20230501"
    conversion_events: ["purchase"]
    transaction_custom_parameters:
      - name: "transaction_id"
        value_type: "int_value"
    cta_click_custom_parameters:
      - name: "label"
        value_type: "string_value"
      - name: "tm_page"
        value_type: "string_value"
      - name: "tm_section"
        value_type: "string_value"
      - name: "tm_cta"
        value_type: "string_value"

models:
  ga4:
    staging:
      recommended_events:
        stg_ga4__event_purchase:
          +enabled: true
        stg_ga4__event_generate_lead:
          +enabled: true

I'm using package version 5.1.0.

If I change start_date to 2023-11-01 it runs without errors. I'm not sure from which date it exactly starts to fail, but it's somewhere between 20230501 and 20231101

dgitis commented 7 months ago

This was caused by Google adding item_params to the items record.

You can fix this by running dbt run --select base_ga4__events --full-refresh.

erikverheij commented 7 months ago

Hi @dgitis, that doesn't do the trick for me :(.

dgitis commented 7 months ago

Every property that I checked got the item_params on the same day, but it is possible that it hasn't been rolled out to your export.

Check the raw BQ export schema for the item_params record.

image

If item_params doesn't exist, in the source data, then you'll have to disable the base model in the models block, copy the base model code to your project and edit out the item_params unnesting in that model.

When it does roll out for you, you can undo all of that and benefit from the custom item parameters.

If that doesn't work, what version of the package are you one? What's in your packages.yml file? Did you maybe forget to run dbt deps before running full refresh on the base model?

adamribaudo-velir commented 7 months ago

As a work-around, you could override the default__base_select_renamed() macro to remove the item_params field.

Strange that your export doesn't have this field, though.

erikverheij commented 7 months ago

Hi guys, thanks for your suggestions!

I discovered that items.item_params doesn't exist in older shards. So for example in events_20230920 the field items.item_params is absent. In the new shards it is available. But I assume that's the case for everyone? Or isn't it?

If that doesn't work, what version of the package are you one? What's in your packages.yml file? Did you maybe forget to run dbt deps before running full refresh on the base model? This is my packages.yml file

packages:
- package: Velir/ga4
version: [">=5.1.0", "<5.2.0"]

I double-checked that I did run dbt deps. I've also verified the version in ./dbt_packages/ga4/dbt_project.yml which is 5.1.0.

As a work-around, you could override the default__base_select_renamed() macro to remove the item_params field. Strange that your export doesn't have this field, though.

Thanks! I hope there's a solution that allows me to stick to the default macro, but it's great to have this as a workaround for now.

3v-dgudaitis commented 6 months ago

I just had to run the command listed in this comment on a multi-site installation to get it working.

https://github.com/Velir/dbt-ga4/issues/280#issuecomment-1780785281

You'll need to install GCloud CLI on your machine, save the JSON file referenced in the comment to run the command.

efrazier commented 6 months ago

Hey Guys,

dbt run --select base_ga4__events --full-refresh

worked fine for me and very much appreciate the work on this!! I did do a full update from my forked repo before running it.

adamribaudo-velir commented 2 months ago

@erikverheij are you all set here?

erikverheij commented 2 months ago

I'm still working with the workaround ( overriding default__base_select_renamed)

Probably the problem is in the fact that items.item_params doesn't exist in older shards. So for example in events_20230920 the field items.item_params is absent. In the new shards, it is available. But I assume that's the case for everyone? Or isn't it?

dgitis commented 2 months ago

I just had to run the command listed in this comment on a multi-site installation to get it working.

#280 (comment)

Did you follow the instructions in the above link? That should add the items.item_params field manually and resolve the error.

erikverheij commented 3 weeks ago

Hi @dgitis, I see I missed your last reply on this.

I didn't run that as it would update the base_ga4__events schema, but in my case the error comes from selecting item_params from the events_* tables.

Also, I completely rebuilt the dataset from scratch (removed alle tables), resulting in the same error.

To me it seems that your older shards like events_20230101 also have the items.item_params field in the schema, otherwise it should fail in your case as well.

Is that the case? If so, then maybe Google updated the schema of the old shards, but failed to do so in our instance.

dgitis commented 3 weeks ago

Also, I completely rebuilt the dataset from scratch (removed alle tables), resulting in the same error.

If you look at the schema for base_ga4__events, does it have items.item_params in it?

If not, then the rebuild didn't work and you'll need to follow the instructions linked earlier.

To me it seems that your older shards like events_20230101 also have the items.item_params field in the schema, otherwise it should fail in your case as well.

There are settings that govern how adding fields to partitioned tables is handled. If sharded tables are similar, then I don't think the old shards should have the new field.

but in my case the error comes from selecting itemparams from the events* tables.

But it's the base_ga4__events model that is doing the selecting. Again, refer back to the schema for that table. If it doesn't have items.item_params, then you need to add it manually using the instructions I linked earlier.

erikverheij commented 2 weeks ago

Thanks for the clarification 👍! I'll give it a try shortly and come back here to share the results!

erikverheij commented 1 week ago

Thanks everyone, running the migration from https://github.com/Velir/dbt-ga4/issues/280#issuecomment-1780785281 did the trick indeed.

erikverheij commented 1 week ago

Closing..