Open bdtoole opened 2 days ago
Thanks for opening this Feature Request @bdtoole!
As discussed in the slack conversation you linked above, this is something that should be relatively straightforward to add to the package. However, this is not something our team would likely pick up in the short term unless we see increased community interest in this PR. That being said, since it seems you're open to creating a PR I would be happy to share the steps necessary to provide union_schema support for this package. You can then contribute the changes in a PR which our team at Fivetran would be able to review and merge into the next release in a quicker turnaround.
If that is of interest to you, or anyone else who comes across this thread and is interested in contributing the union schema feature then you can view the required steps below:
models/tmp
folder which will include a single model titled stg_mixpanel__event_tmp
.{{
fivetran_utils.union_data(
table_identifier='event',
database_variable='mixpanel_database',
schema_variable='mixpanel_schema',
default_database=target.database,
default_schema='mixpanel',
default_variable='event',
union_schema_variable='mixpanel_union_schemas',
union_database_variable='mixpanel_union_databases'
)
}}
{{ fivetran_utils.source_relation(
union_schema_variable='mixpanel_union_schemas',
union_database_variable='mixpanel_union_databases')
}}
source_relation
as a new field here and anywhere else we explicitly select fields in the models.source_relation
as a condition. So here you will need to include the following to ensure joins are only performed on the relevant data:
join user_first_events
on spine.date_day >= user_first_events.first_event_day -- each user-event_type will a record for every day since their first day
and spine.source_relation = user_first_events.source_relation
The above steps should be all that's required. There are additional changes and some improvements we need to apply as well, but our team can handle that during the PR review if you choose to open one. Let me know if you're interested and if you have any other questions. Thanks!
Is there an existing feature request for this?
Describe the Feature
We have multiple Fivetran connectors for mixpanel that ultimately result in multiple source schemas for our mixpanel data. This package currently does not support the
union_schemas
functionality, which means if I want to use the package for our mixpanel data, I can only use it for one schema and I'd have to manually define the second schema and model the data myself.How would you implement this feature?
The feature has effectively been built out already - it just hasn't been applied to this package, so the necessary changes would need to be made to support it here.
Describe alternatives you've considered
No response
Are you interested in contributing this feature?
Anything else?
For reference, here's a Slack conversation I had with @fivetran-joemarkiewicz about this.