While running basically the same model in parallel to smooth the transition between models, I noticed that building the models from a fct_ga4__event_page_view model that I created was 40 times more efficient than from stg_ga4__event_page_view.
We build from stg_ga4__event_page_view in a number of places, all of which are opportunities for a performance improvement. This doesn't take caching into account, but I haven't noticed much of a cache effect from model to model (the cache clearly helps when re-using fields within a single model).
However, my evidence that there is little cache re-use between models is only anecdotal, so I'd like to see evidence either way.
I see us having two options here:
1. We make fct_ga4__event_page_view a core package model.
2. We conditionally build from fct_ga4__event_page_view if it is present, otherwise falling back to stg_ga4__event_page_view.
The first option seems simpler. It has the advantage that enabling and disabling package models is a pattern that we've used elsewhere, so it's not a big leap to expect users to disable the fct_ga4__event_page_view model. However, I find myself customizing this model a lot for each client, so we'd be creating a model that almost always gets disabled.
The second option requires greater complexity in the package models. I suspect many people will be able to create fct_ga4__event_page_view models that don't error without consulting any documentation, and the error messages should make it fairly clear what is missing from any user-created fct_ga4__event_page_view model.
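
As a starting point for discussion, the second option could be sketched as a macro that package models call in place of a direct ref to stg_ga4__event_page_view. This is only a sketch under stated assumptions: the macro name `ga4_page_view_relation` is hypothetical, and it assumes that a user-created fct_ga4__event_page_view is built in the target database and schema. It relies on dbt's `adapter.get_relation`, which returns none when the relation does not exist in the warehouse.

```sql
{# Hypothetical helper, not part of the package today. #}
{% macro ga4_page_view_relation() %}
    {% set fct_relation = none %}

    {# Only query the warehouse at execution time, not while parsing. #}
    {% if execute %}
        {# Assumption: the user's model lives in the target database and schema. #}
        {% set fct_relation = adapter.get_relation(
            database=target.database,
            schema=target.schema,
            identifier='fct_ga4__event_page_view'
        ) %}
    {% endif %}

    {% if fct_relation is not none %}
        {# Use the user's model when it already exists in the warehouse. #}
        {{ return(fct_relation) }}
    {% else %}
        {# Otherwise fall back to the staging model shipped with the package. #}
        {{ return(ref('stg_ga4__event_page_view')) }}
    {% endif %}
{% endmacro %}
```

A package model would then select from `{{ ga4_page_view_relation() }}`. Because the user model is looked up in the warehouse rather than through ref, dbt would not know to build it first, so run ordering would need separate handling.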
Is this something that we should pursue?
Is caching more effective than I give it credit for?
Do you have any preferences for either of these methods?