Open dgitis opened 7 months ago
@adamribaudo-velir, while I've verified that these work locally, I've only verified with a minimal amount of data. Multi-site changes were verified by setting the same property ID several times under `property_ids`. Obviously this is not ideal.
It would be best to verify this using a proper multi-site installation. I don't have any current clients, but I can ask some past clients.
I did not write any tests. dbt recommends testing certain things. The only logic worth testing, the multi-site macro, is not testable by the new framework.
I defaulted these new tables to `+enable: false`.
I also ran `base_ga4__pseudonymous_users` in to `stg_ga4__client_ids`. I renamed at the staging model because that's where the client IDs are generated.
I'm on board with the approach, but unable to run it due to this error. Couldn't find an obvious issue. I can run the main branch fine with the same project configuration.
20:08:18 Completed with 1 error and 0 warnings:
20:08:18
20:08:18 Compilation Error in model base_ga4__events (models\staging\base\base_ga4__events.sql)
can only concatenate str (not "int") to str
> in macro combine_property_data (macros\combine_property_data.sql)
> called by macro run_hooks (macros\materializations\hooks.sql)
> called by macro materialization_incremental_bigquery (macros\materializations\incremental.sql)
> called by model base_ga4__events (models\staging\base\base_ga4__events.sql)
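For context on the message itself: `can only concatenate str (not "int") to str` is the Python `TypeError` raised when `+` joins a string and an integer, and as far as I know Jinja's `+` delegates to the same Python operation. The sketch below reproduces it in plain Python, assuming a numeric property ID is being appended to a string. The helper names and the example ID are hypothetical and are not taken from `combine_property_data`.

```python
def dataset_name_unsafe(property_id):
    # Hypothetical helper: fails when property_id is an int,
    # for example an unquoted number read from YAML.
    return "analytics_" + property_id


def dataset_name(property_id):
    # Casting first works for both 123456 and "123456".
    return "analytics_" + str(property_id)


try:
    dataset_name_unsafe(123456)
except TypeError as exc:
    error_message = str(exc)

safe_name = dataset_name(123456)
print(error_message)
print(safe_name)
```

If this is the cause, casting inside the macro (or quoting the IDs in the project config) would avoid it, but I have not confirmed that against the macro source.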
I fixed the duplicate code issues, which seems to have fixed the error, though I'm not sure why. I also updated a test that worked on my very small test site but wouldn't work on a larger site.
I get an error when building the new staging models:
(python312env) C:\GitHub\velir-ga4-test-project>dbt run -m stg_ga4__client_keys
13:20:47 Running with dbt=1.8.7
13:20:48 Registered adapter: bigquery=1.8.3
13:20:48 Found 42 models, 29 data tests, 1 seed, 3 sources, 623 macros
13:20:48
13:20:50 Concurrency: 4 threads (target='dev')
13:20:50
13:20:50 1 of 1 START sql view model dbt_dev_aribaudo.stg_ga4__client_keys .............. [RUN]
13:20:51 BigQuery adapter: https://console.cloud.google.com/bigquery?project=velir-website-analytics&j=bq:US:660c1784-627b-4503-b4e5-a08e57355482&page=queryresults
13:20:51 1 of 1 ERROR creating sql view model dbt_dev_aribaudo.stg_ga4__client_keys ..... [ERROR in 1.17s]
13:20:51
13:20:51 Finished running 1 view model in 0 hours 0 minutes and 2.93 seconds (2.93s).
13:20:51
13:20:51 Completed with 1 error and 0 warnings:
13:20:51
13:20:51 Database Error in model stg_ga4__client_keys (models\staging\stg_ga4__client_keys.sql)
Syntax error: Expected ")" but got identifier "user_property_name" at [11:82]
compiled code at target\run\ga4\models\staging\stg_ga4__client_keys.sql
13:20:51
13:20:51 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
I have 1 user property defined:
user_properties:
  - user_property_name: "random_number"
    value_type: "int_value"
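To make the expected outcome concrete, here is a rough Python sketch of how a config like this could be turned into one select expression per property. It is an assumption about the intended output, not the package's macro: `user_property_columns` and `ALLOWED_VALUE_TYPES` are hypothetical names, and the `key` / `value.<type>` shape follows the events-table `user_properties` field, which the user export tables may not share.

```python
# Value fields as they appear on user_properties in the events export (assumed here).
ALLOWED_VALUE_TYPES = {"string_value", "int_value", "float_value", "double_value"}


def user_property_columns(user_properties):
    """Render one SQL select expression per configured user property."""
    columns = []
    for prop in user_properties:
        name = prop["user_property_name"]
        value_type = prop["value_type"]
        if value_type not in ALLOWED_VALUE_TYPES:
            raise ValueError(f"unsupported value_type: {value_type}")
        # Pull the matching key out of the repeated user_properties record.
        columns.append(
            f"(select value.{value_type} from unnest(user_properties) "
            f"where key = '{name}') as {name}"
        )
    return columns


config = [{"user_property_name": "random_number", "value_type": "int_value"}]
rendered = user_property_columns(config)
print(",\n".join(rendered))
```

Comparing something like this with the compiled SQL in `target\run` around position [11:82] might show where the literal `user_property_name` is ending up in the query.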
I'd be ok approving a PR with JUST the base model data for user export so that package users at least have access to that data. We could sort out the more advanced stuff (pulling in properties, audiences) later.
Description & motivation
Work-in-progress resolving #285.
The goal of this PR is to support the `pseudonymous_user_id` and `user_id` tables.
My initial thought is that we should keep both the package's current user tables and new ones derived from the new user export options in GA4. The reason for this is to support both old and new installations.
README modifications are not done, but I expect we will want to add a new section on disabling user models that explains the differences between our various models and how to use them.
What defaults should we have for `+enable` on these tables?
This PR supports `audiences` fields from the export. While working with a test site that has implemented audiences, the data doesn't seem to be very useful unless its ID field joins with Google Ads audience exports. Despite this, I think we should leave this in because it needs to be enabled by a variable.
The `user_properties` fields were done without sample data. Naming and logic will likely need to be updated.
Multi-site is not currently supported.
We also might want to move the logic in `base_ga4__*` models to `stg_ga4__*` models.
I also need to review package naming conventions to ensure data, like geo data, is consistent with elsewhere in the package.
To-do:
`user_property` logic
Here are the docs for the source tables.
Checklist
`dbt test` and `python -m pytest .` to validate existing tests