fivetran / dbt_stripe

Data models for Stripe built using dbt.
https://fivetran.github.io/dbt_stripe/
Apache License 2.0
30 stars 31 forks source link

[Bug] Error in tmp models due to metadata type json #64

Open pietrotansini opened 1 year ago

pietrotansini commented 1 year ago

Is there an existing issue for this?

Describe the issue

Tmp models returns an error because cannot cast metadata dimension which is passed as json type instead of string. Metadata dimension used to be string but most recent Fivetran tables turned it into json.

Relevant error log or model output

Database Error in model stg_stripe__invoice_line_item_tmp (models/tmp/stg_stripe__invoice_line_item_tmp.sql)
  Invalid cast from JSON to STRING at [94:26]
  compiled Code at target/run/stripe_source/models/tmp/stg_stripe__invoice_line_item_tmp.sql

Expected behavior

The tmp models should cast metadata fields as string with to_json() function to convert the value to a JSON string before casting it to a string.

The package should generate the output tables without errors.

dbt Project configurations

name: 'dbzaps' version: '1.0.0' config-version: 2

This setting configures which "profile" dbt uses for this project.

profile: 'default'

These configurations specify where dbt should look for different types of files.

The source-paths config, for example, states that models in this project can be

found in the "models/" directory. You probably won't need to change these!

model-paths: ["models"] analysis-paths: ["analyses"] test-paths: ["tests"] seed-paths: ["seeds"] macro-paths: ["macros"] snapshot-paths: ["snapshots"]

target-path: "target" # directory which will store compiled SQL files clean-targets: # directories to be removed by dbt clean

Configuring models

Full documentation: https://docs.getdbt.com/docs/configuring-models

models:

Applies to all files under models/bilateralstimulation/

dbzaps: bilateralstimulation: staging: materialized: view intermediate: materialized: ephemeral marts: +materialized: table

Fivetran Stripe | Bilateral Stimulation

vars: stripe_union_schemas: ['stripe_bilateralstimulation_de','stripe_bilateralstimulation_us','stripe_bilateralstimulation'] # use this if the data is in different schemas/datasets of the same database/project

Package versions

packages:

What database are you using dbt with?

bigquery

dbt Version

dbt Version: 1.4 (latest)

Additional Context

Screenshot 2023-05-17 at 08 15 29

Are you willing to open a PR to help address this issue?

fivetran-joemarkiewicz commented 1 year ago

Hi @pietrotansini thanks for opening this issue!

We have actually seen a number of these issues sprout up over the last week across all our Fivetran dbt packages. It seems to be due to newer BigQuery driver that handles JSON datatypes differently. My team and I are currently investigating and are hoping to respond soon with a possible fix.

Be sure to follow this issue for more updates!

fivetran-joemarkiewicz commented 1 year ago

Hi @pietrotansini I have been able to explore this a bit further and have been able to identify the core issue. It seems new connectors set up against a BigQuery destination are syncing JSON fields as a JSON datatype; however, previous connectors are syncing JSON fields as a STRING datatype. As such, when the package is attempting to union the tables across schemas it is finding conflicting datatypes for the metadata field (as some are STRING type and others are JSON type).

Unfortunately there is not much we can do within the dbt package itself. Especially as we are unable to do a JSON to STRING or STRING to JSON cast. My recommendation at the moment would be to open a support ticket to discuss the issue you are seeing with our support team as they will be able to help with any next steps in triaging the datatype mismatch you are seeing.

pietrotansini commented 1 year ago

Thank you Joe for the details. I'll open a ticket with your support team to understand how we can deal with the different datatype