dbt-labs / snowplow

Data models for Snowplow analytics.
https://hub.getdbt.com/dbt-labs/snowplow/latest/
Apache License 2.0
126 stars, 45 forks

Rm unique test event_id in base events #73

Closed jtcohen6 closed 4 years ago

jtcohen6 commented 4 years ago

Given:

I think our position should be that analysts should not deduplicate their raw events before feeding them into the Snowplow package. Deduplication is an expensive operation, especially on a dataset this large, and it's not really in keeping with the paradigms of "bigger data" platforms (BigQuery et al.).

To that end, we should disable the unique test on snowplow_base_events.event_id.
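
As a rough sketch of what that could look like, assuming the test is declared in a dbt `schema.yml` for the model (the file layout and the `not_null` test here are illustrative assumptions, not copied from the package):

```yaml
# Hypothetical schema.yml fragment; the real file in the package may differ.
version: 2

models:
  - name: snowplow_base_events
    columns:
      - name: event_id
        tests:
          - not_null   # assumption: kept only for illustration
          # - unique   # dropped, so raw events need not be deduplicated upstream
```

With the `unique` test gone, duplicate `event_id` values in the base events would no longer fail `dbt test`.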