fivetran / dbt_netsuite_source

Data models for Fivetran's Netsuite source package, built using dbt.
https://fivetran.github.io/dbt_netsuite_source/
Apache License 2.0
13 stars 19 forks source link

[BUG] Using `ephemeral` materialization causes all tables to have all null records #33

Open jaredx435k2d0 opened 1 year ago

jaredx435k2d0 commented 1 year ago

Is there an existing issue for this?

Describe the issue

Due to #32, I'm considering ephemeral as the materialization type for netsuite_source, so that I can then create my own staging schema that has all of the tables my connector is bringing in.

The only alternatives I can think of is to try to put them in the same schema, which would cause name conflicts, or to have two staging schemas for the same source, which will be confusing for everyone. Since nobody will directly use the staging models from netsuite_source anyway, I figure it's ok to abstract / hide them away in ephemeral models.

When I try to do this, all fields for all records for all models are null. When I look in my target folder, I can see the generated SQL is explicitly using null as the value for everything.

Is there a way I can use ephemeral as the materialization for netsuite_source?

Relevant error log or model output

target/compiled/netsuite_source/models/netsuite2/stg_netsuite2__transaction_lines.sql:


with  __dbt__cte__stg_netsuite2__transaction_lines_tmp as (

select * 
from [my_db_name_redacted].netsuite_suiteanalytics.transactionline
),base as (

    select * 
    from __dbt__cte__stg_netsuite2__transaction_lines_tmp
),

fields as (

    select

    cast(null as boolean) as 

    _fivetran_deleted

 , 
    cast(null as TIMESTAMP) as 

    _fivetran_synced

 , 
    cast(null as INT) as 

    id

 , 
    cast(null as INT) as 

    transaction

 , 
    cast(null as INT) as 

    linesequencenumber

 , 
    cast(null as TEXT) as 

    memo

 , 
    cast(null as INT) as 

    entity

 , 
    cast(null as INT) as 

    item

 , 
    cast(null as INT) as 

    class

 , 
    cast(null as INT) as 

    location

 , 
    cast(null as INT) as 

    subsidiary

 , 
    cast(null as INT) as 

    department

 , 
    cast(null as TEXT) as 

    isclosed

 , 
    cast(null as TEXT) as 

    isbillable

 , 
    cast(null as TEXT) as 

    iscogs

 , 
    cast(null as TEXT) as 

    cleared

 , 
    cast(null as TEXT) as 

    commitmentfirm

 , 
    cast(null as TEXT) as 

    mainline

 , 
    cast(null as TEXT) as 

    taxline

    from base
),

final as (

    select
        _fivetran_synced,
        id as transaction_line_id,
        transaction as transaction_id,
        linesequencenumber as transaction_line_number,
        memo,
        entity as entity_id,
        item as item_id,
        class as class_id,
        location as location_id,
        subsidiary as subsidiary_id,
        department as department_id,
        isclosed = 'T' as is_closed,
        isbillable = 'T' as is_billable,
        iscogs = 'T' as is_cogs,
        cleared = 'T' as is_cleared,
        commitmentfirm = 'T' as is_commitment_firm,
        mainline = 'T' as is_main_line,
        taxline = 'T' as is_tax_line

        --The below macro adds the fields defined within your transaction_lines_pass_through_columns variable into the staging model

    from fields
)

select * 
from final

Expected behavior

I would expect the CTEs to use the actual fields rather than nulls.

dbt Project configurations

models:

  +persist_docs:
    relation: true
    columns: true

  +copy_grants: true

  netsuite_source:
    +schema: stg_netsuite_suiteanalytics
    +materialized: ephemeral

vars:
  netsuite_data_model: netsuite2
  netsuite2__multibook_accounting_enabled: false
  netsuite2__using_vendor_categories: true

Package versions

packages:
  # - package: calogica/dbt_date
  #   version: 0.7.2 
  #   included as dependency in fivetran/zendesk

  # - package: dbt-labs/dbt_utils
  #   version: 1.0.0 
  #   included as dependency in fivetran/zendesk

  # - package: dbt-labs/dbt_project_evaluator
  #   version: 0.3.0

  # - package: calogica/dbt_expectations
  #   version: 0.8.2

  - package: dbt-labs/codegen
    version: 0.9.0

  - package: fivetran/zendesk
    version: [">=0.10.0", "<0.11.0"]
    # includes dependencies to calogica/dbt_date and dbt-labs/dbt_utils

  - package: fivetran/netsuite
    version: [">=0.7.0", "<0.8.0"]
    # includes dependencies to fivetran/netsuite_source, fivetran/fivetran_utils, dbt-labs/dbt_utils, and dbt-labs/spark_utils

What database are you using dbt with?

snowflake 7.12.3

dbt Version

Core:
  - installed: 1.4.5
  - latest:    1.4.5 - Up to date!

Plugins:
  - snowflake: 1.4.2 - Up to date!

Additional Context

If I switch materialized to view then it all works great.

I'm using Python 3.10.10.

Are you willing to open a PR to help address this issue?

fivetran-joemarkiewicz commented 1 year ago

HI @jaredx435k2d0 thanks for opening this issue.

Unfortunately with how all of our source packages are created, you will not be able to materialize the staging models as ephemeral. They will either need to be views or tables.

This is particularly necessary due to this cte that exists in all of our staging models. This is required to ensure users of the package see a successful run regardless if they have all the fields in the source that we are querying in the staging model. In order for this to work we leverage the adapter.get_columns function in dbt. Unfortunately, this function requires the model it is reference and the model itself to be a view or table.

As this is an integral part to our package infrastructure we likely will not make this update to the package. Although I understand you desire and wish we could have the adapter function work for ephemeral models. If you wish to get around this, I would recommend forking the package and removing that cte. This way you can take advantage materializing as ephemeral while avoiding the macro if you in fact have all of the columns. Otherwise, you can override the models and add your own custom models that do not leverage this cte.