dbt-labs / snowplow

Data models for snowplow analytics.
https://hub.getdbt.com/dbt-labs/snowplow/latest/
Apache License 2.0
126 stars, 45 forks

Support new snowplow data model #102

Open sphinks opened 3 years ago

sphinks commented 3 years ago

Describe the feature

It looks like Snowplow has switched to a new way of processing raw data (https://github.com/snowplow/data-models) and calls the new data model V1. The current implementation of the DBT module no longer suits it. The new data model should be supported/adopted by the DBT snowplow module.

Describe alternatives you've considered

Possible workaround: have the DBT module create all the snowplow tables by parsing the JSON fields in the EVENTS table. E.g. create the table com_snowplowanalytics_snowplow_web_page_1 by parsing the field EVENTS.contexts_com_snowplowanalytics_snowplow_web_page_1.
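To make the workaround more concrete, here is a rough sketch in plain Python rather than DBT SQL, just to show the shape of the transformation. It assumes the context column holds a JSON array of objects with an `id` key, and that the derived table carries `root_id` and `root_tstamp` taken from the event's `event_id` and `collector_tstamp`. Those column names and the sample values are assumptions for illustration, not something taken from the package.

```python
import json

# Column named in the issue; the JSON shape inside it is an assumption.
CONTEXT_COLUMN = "contexts_com_snowplowanalytics_snowplow_web_page_1"


def flatten_web_page_context(events):
    """Return one row per context entry found in each event's JSON column."""
    rows = []
    for event in events:
        raw = event.get(CONTEXT_COLUMN)
        if not raw:
            # Event carries no web_page context, so it yields no rows.
            continue
        # Accept either a JSON string or an already-parsed list.
        entries = json.loads(raw) if isinstance(raw, str) else raw
        for entry in entries:
            rows.append(
                {
                    "root_id": event["event_id"],  # assumed join key
                    "root_tstamp": event["collector_tstamp"],  # assumed
                    "id": entry.get("id"),
                }
            )
    return rows


# Made-up sample rows standing in for the EVENTS table.
sample_events = [
    {
        "event_id": "e1",
        "collector_tstamp": "2021-01-01 00:00:00",
        CONTEXT_COLUMN: '[{"id": "pv-1"}]',
    },
    {
        "event_id": "e2",
        "collector_tstamp": "2021-01-01 00:00:05",
        CONTEXT_COLUMN: None,
    },
]

web_page_rows = flatten_web_page_context(sample_events)
```

In a real DBT model the same flattening would be done with the warehouse's own JSON functions; the Python version only illustrates which fields end up in the derived table.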

Who will this benefit?

All users who are using DBT and Snowplow together on their projects.

sphinks commented 3 years ago

@jtcohen6 @drewbanin I'd like to get your insights on the best way to implement the workaround. I have implemented workaround models for the new Snowflake data model and am going to open a PR. However, the main question is how to configure such a switch in the logic. I see two options: 1) Introduce a special variable (quite easy to implement, but not so clear to the user what it is for). 2) Reuse existing variables such as 'snowplow:context:web_page': e.g. if the value of 'snowplow:events' (atomic.events) equals the value of 'snowplow:context:web_page', we assume the Snowflake layout with a single table and switch to the workaround (clearer for the user, but could be fragile).
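For discussion, the two options could look roughly like this, sketched in Python instead of Jinja so the logic is easy to read. The `project_vars` dict stands in for the project's variables, and the name `snowplow:single_table_layout` is a hypothetical variable made up for option 1; only 'snowplow:events' and 'snowplow:context:web_page' come from the existing configuration.

```python
def uses_single_table_layout_explicit(project_vars):
    """Option 1: an explicit flag (hypothetical variable name)."""
    return bool(project_vars.get("snowplow:single_table_layout", False))


def uses_single_table_layout(project_vars):
    """Option 2: infer the layout from the existing variables.

    If the web_page context variable points at the same relation as the
    events variable, assume contexts live as JSON columns in that table.
    """
    events = project_vars.get("snowplow:events")
    web_page = project_vars.get("snowplow:context:web_page")
    return events is not None and events == web_page


# Example configurations (values are illustrative).
single_table = {
    "snowplow:events": "atomic.events",
    "snowplow:context:web_page": "atomic.events",
}
separate_tables = {
    "snowplow:events": "atomic.events",
    "snowplow:context:web_page": "atomic.com_snowplowanalytics_snowplow_web_page_1",
}
```

Option 2 needs no new configuration, but it is the fragile one: the comparison silently fails if the two values name the same table in different ways (different casing, quoting or schema qualification).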

sphinks commented 3 years ago

Update: the Snowplow team is working on a new DBT package: https://github.com/snowplow/snowplow/issues/4466