Closed joanneesteban closed 3 years ago
In conjunction with #6356
BQ - All Transaction Events by Day
BQ - All Product Form Events by Day
BQ - Generic - Overall PVs, Sessions, Users, Bounces by Day
@jonwehausen @bsmartin-ep do we have updates on:
Document what ETL funnels might look like
Document how time to complete might look like
and if we can do them for the KPI dashboard switchovers next sprint? It looks like Caregivers has one of these metrics.
Monthly - OMB Dashboard has been migrated to BigQuery.
The ETL process that creates the final output table used in all the cards has been updated to use three components:
Thank you @bsmartin-ep!
Also, a side point / update for @joanneesteban: where we previously needed Magic ETL ahead of BigQuery for things like data type changes, scrubbing, parsing, etc., we're now able to handle more of these basic ETL needs directly within our SQL queries, so the data is ready to go when pulled into DOMO. Of course, this assumes the use of those functions doesn't create a major cost inflation, which we have yet to see.
Thank you for the updates, @bsmartin-ep @jonwehausen ! Can we test out the time to complete ETL with Caregiver on Staging?
> assumes the use of those functions doesn't create a major cost inflation, which we have yet to see.
Would like to see if effort levels are lower, data integrity is there, and ^cost isn't incredibly inflated.
Also, an idea for a super low-lift Prometheus MVP (and by super low-lift, I mean just one metric instead of many, so who knows): the one "system availability" metric on the OMB dashboard.
Card 'Total Pageviews' on the VEO dashboard has been switched over to the BQ - Generic - Overall PVs, Sessions, Users, Bounces by Day dataset.
I added a Beast Mode field to filter the data table to dates before the current month (that is, through the end of the previous month) so that an incomplete month doesn't show up erroneously as a drop.
(CASE
WHEN
YEAR(`DATE`) < YEAR(CURRENT_DATE()) OR
(YEAR(`DATE`) = YEAR(CURRENT_DATE()) AND MONTH(`DATE`) < MONTH(CURRENT_DATE()))
THEN 'Y'
ELSE 'N'
END)
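For anyone checking the logic outside Domo, here is a minimal Python sketch of the same comparison. The function name `is_complete_month` and the sample dates are hypothetical; the sketch only mirrors the CASE expression above and assumes `today` stands in for `CURRENT_DATE()`.

```python
from datetime import date

def is_complete_month(row_date: date, today: date) -> str:
    """Return 'Y' when row_date falls in a month before today's month, else 'N'."""
    earlier_year = row_date.year < today.year
    same_year_earlier_month = (
        row_date.year == today.year and row_date.month < today.month
    )
    return "Y" if earlier_year or same_year_earlier_month else "N"

today = date(2020, 9, 15)  # hypothetical stand-in for CURRENT_DATE()
for d in (date(2019, 12, 31), date(2020, 8, 31), date(2020, 9, 1)):
    print(d.isoformat(), is_complete_month(d, today))
```

Rows in the current (still incomplete) month come back as 'N', so filtering cards to 'Y' hides them.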
@jonwehausen -
2 x Total Pageviews and 2 x Users cards (OMB and Monthly Council) have been migrated from dataset Va.gov & Vets.gov PVs and Users.
This required creating a modernized BQ dataset and a Beast Mode field (dataset-scoped) to only show data before the previous month.
Pageviews look pretty close.
Users are dramatically different, as you'd expect going from GA --> BQ.
Before:
After:
I think that's it for these old GA datasets...
That is a very dramatic difference...is that also after de-duping?
@joanneesteban / @jonwehausen -
Cards have been updated to new monthly BQ datasets. Looking a lot closer now.
Old | New BQ |
---|---|
Also updated the pageview cards.
Old | New BQ |
---|---|
Not quite the exact match on Pageviews I was expecting, but curiously our "new" number matches the July 2020 numbers more closely:
(89.93M vs the new 92.39M)
Great! Thanks, @bsmartin-ep ! Do you know why the July numbers for page views would have changed more than the others?
@amycesal , since these reports will start to go out next week, if we're ready to switch over the datasets to BigQuery let's start thinking about how to add annotations that display what the changes are and the value.
> Do you know why the July numbers for page views would have changed more than the others?
I think I figured out the issue with pageviews @joanneesteban. The old GA report is configured to run on the last day of the month minus one day (and not the first of the next month like it should probably be). So we've been truncating the last day of pageview data from our monthly total.
HOWEVER - it's also re-fetching all data back to 2018 every time it runs, so when the report runs, it fixes the last day of the previous month.
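A rough illustration of that effect, using made-up numbers: the flat 100 pageviews per day and the specific fetch dates are assumptions for the sketch, not values from the actual report.

```python
from datetime import date, timedelta

def monthly_total(daily, year, month, fetched_through):
    """Sum daily pageviews for one month, counting only days already fetched."""
    return sum(
        views
        for day, views in daily.items()
        if day.year == year and day.month == month and day <= fetched_through
    )

# Hypothetical flat traffic: 100 pageviews every day in July and August 2020.
start = date(2020, 7, 1)
daily = {start + timedelta(days=i): 100 for i in range(62)}

# A run that only has data through July 30 misses the last day of July...
early_run = monthly_total(daily, 2020, 7, fetched_through=date(2020, 7, 30))
# ...while the next month's run, which re-fetches history, picks it up.
later_run = monthly_total(daily, 2020, 7, fetched_through=date(2020, 8, 30))
print(early_run, later_run)  # 3000 3100
```

This is why only the most recent month looks short at report time, while older months appear correct.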
@joanneesteban -
Pageview dataset has been fixed and back-filled to correct July 2020.
I also adjusted the scheduling and back-filled this one:
This one also had the issue, but I don't think it should be running anymore. It's pointing to the old vets.gov site and is only (somehow) contributing a few ambient pageviews every month. Maybe a cached copy on archive.org?
Thanks, @bsmartin-ep ! Yes, if that's grabbing vets.gov info, cached copy is probably better.
@bmcgrady-ep -
eBenefits is ready for your QA.
Dashboard Name | Type | Old | New BQ |
---|---|---|---|
eBenefits - KPIs - WIP | KPIs | https://va-gov.domo.com/page/177811372 | https://va-gov.domo.com/page/1726972169 |
The old datasets were running (maybe incorrectly) as a rolling 60-day REPLACE. I fixed them to start appending a few days ago. Nonetheless, the BQ and GA datasets may have different overall date ranges. They're pretty close (especially if you filter down to a matching date range at the card level) but not exact.
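To show why the two update modes end up with different date ranges, here is a small Python sketch. It assumes REPLACE overwrites the dataset with the latest 60-day window and models APPEND as a de-duplicated union of windows; the run dates are hypothetical.

```python
from datetime import date, timedelta

WINDOW_DAYS = 60  # rolling window size mentioned above

def fetch_window(run_day):
    """Dates returned by one hypothetical run: the 60 days ending on run_day."""
    return {run_day - timedelta(days=i) for i in range(WINDOW_DAYS)}

def run_replace(_existing, run_day):
    # REPLACE: each run overwrites the dataset with the latest window only.
    return fetch_window(run_day)

def run_append(existing, run_day):
    # APPEND (modelled as a de-duplicated union): earlier dates are kept.
    return existing | fetch_window(run_day)

replace_ds, append_ds = set(), set()
first_run = date(2020, 1, 1)
for offset in range(0, 120, 30):  # four hypothetical runs, 30 days apart
    day = first_run + timedelta(days=offset)
    replace_ds = run_replace(replace_ds, day)
    append_ds = run_append(append_ds, day)

print(len(replace_ds), len(append_ds))  # 60 150
```

Under these assumptions the REPLACE dataset never holds more than 60 days, while the appended one keeps growing.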
The bounce rate will be dramatically lower on the BQ dashboard due to the artificially high bounce rate of the dedicated eBenefits GA view we were using. A more comparable bounce rate would come from filtering the All View to just these pages and looking at the bounce rate for them:
AND REGEXP_CONTAINS(
hits.page.pagePath,
r'^(eauth\.va\.gov|www\.ebenefits\.va\.gov)'
)
AND NOT REGEXP_CONTAINS(
hits.page.pagePath,
r'(/mhv-portal-web|/web/myhealthevet/)'
)
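For a quick sanity check of which paths that filter keeps, the two patterns can be tried in Python. This is only a sketch: `re.search` is used because `REGEXP_CONTAINS` is a partial match, the function name is hypothetical, and the sample paths are invented for illustration.

```python
import re

# Same patterns as in the query fragment above.
INCLUDE = re.compile(r'^(eauth\.va\.gov|www\.ebenefits\.va\.gov)')
EXCLUDE = re.compile(r'(/mhv-portal-web|/web/myhealthevet/)')

def is_ebenefits_page(page_path: str) -> bool:
    """True when the path passes the include pattern and not the exclude pattern."""
    return bool(INCLUDE.search(page_path)) and not EXCLUDE.search(page_path)

# Hypothetical page paths, not taken from the real data.
samples = [
    "www.ebenefits.va.gov/ebenefits/homepage",
    "eauth.va.gov/mhv-portal-web/home",
    "www.va.gov/health-care/",
]
for path in samples:
    print(path, is_ebenefits_page(path))
```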
Thanks and let me know what you find.
@bsmartin-ep - Overall, QA looks good. Few things to note:
@bmcgrady-ep -
Can you also note if these look OK?
@bmcgrady-ep -
VAOS dashboard ready for your review. Thanks!
Dashboard Name | Type | Old | New BQ |
---|---|---|---|
VAOS - KPIs | KPIs | https://va-gov.domo.com/page/565662008 | https://va-gov.domo.com/page/1769944412 |
@bmcgrady-ep -
BAM2 Medical Device dashboard is also ready for your review. Thanks!
Dashboard Name | Type | Old | New BQ |
---|---|---|---|
Medical Device Reordering Tool - KPIs | KPIs | https://va-gov.domo.com/page/1133411197 | https://va-gov.domo.com/page/737992383 |
@bsmartin-ep VAOS Dashboard Review
BAM2 Medical Device Dashboard Review
@bmcgrady-ep -
Both ready for re-review.
@bsmartin-ep VAOS - The Number and frequency of Veterans returning to VAOS card in the GA dataset looks to be double counting.
BAM2 - The Appendix still looks a little off
@bmcgrady-ep -
Chatbot KPI dashboard ready for your review:
Dashboard Name | Type | Old | New BQ |
---|---|---|---|
COVID Chatbot - KPIs | KPIs | https://va-gov.domo.com/page/62901069 | https://va-gov.domo.com/page/986858114 |
Thanks!
@bmcgrady-ep -
FYI - I'm blocked on the Facility Locator KPI page, so nothing to review on that yet.
fl- events. Moving this issue to QA now. Thinking that with all this new data (and our blockers), it might be a good idea to kick off a redesign of the FL KPI page.
@bsmartin-ep COVID Chatbot Review
Thanks for reviewing @bmcgrady-ep.
Looks good (other than the scale getting bumped up) now that I backfilled the GA one.
vs
Forgot to restrict it to mobile and modified the query to look at the immediate next event. Looking good!
vs
This one? What differences are you seeing?
vs
@bsmartin-ep - Okay, now the left chart is looking good. Sorry for the delayed response. Are there any other dashboards you need reviewed?
@bsmartin-ep Wanted to get in writing our thoughts / next steps on the current solution to our daily export / query timing discrepancy issue. Here are potential next steps:
As a quicker win toward our end goal, we could first:
1) Add this snippet to all of our current queries in Domo:
IF (
SELECT
COUNT(1) > 0
FROM
`vsp-analytics-and-insights.176188361.__TABLES__`
WHERE
table_id = CONCAT('ga_sessions_', FORMAT_DATE(
'%Y%m%d',
DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
))
) IS FALSE THEN
RETURN;
ELSE
2) Adjust the scheduling in Domo to "Advanced", running every 30 minutes beginning at 7:45 am EST
Following that, we could:
1) Convert and save all of our current queries as views in BQ
2) Write a stored procedure to do the IF-ELSE check in BigQuery first
3) Adjust the queries in DOMO to query each view (simplifying the DOMO query)
4) Verify advanced scheduling is still configured to run every 30 minutes beginning at 7:45 am
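To make the readiness check easier to reason about, here is a minimal Python sketch of the same idea: build yesterday's `ga_sessions_YYYYMMDD` table name and see whether it exists yet. The helper names and the table listing are hypothetical; in BigQuery the listing would come from `__TABLES__` as in the snippet above.

```python
from datetime import date, timedelta

def expected_table_id(today: date) -> str:
    """Name of the daily export table for the day before `today`."""
    yesterday = today - timedelta(days=1)
    return "ga_sessions_" + yesterday.strftime("%Y%m%d")

def export_is_ready(table_ids, today: date) -> bool:
    """True when yesterday's export table appears in the listing."""
    return expected_table_id(today) in table_ids

tables = {"ga_sessions_20200913", "ga_sessions_20200914"}  # hypothetical listing
print(expected_table_id(date(2020, 9, 15)))        # ga_sessions_20200914
print(export_is_ready(tables, date(2020, 9, 15)))  # True
print(export_is_ready(tables, date(2020, 9, 16)))  # False
```

With the 30-minute schedule, a run that finds the table missing would simply return early and the next run would try again.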
cc: @joanneesteban
Thanks for the work on this! Closing this ticket.
Issue Description
How might we switch over GA Domo datasets to BigQuery?
Tasks
[x] Set up accounts
[x] Use Domo connector
[x] Write queries
[x] Test queries
[x] Write out BigQuery<>Domo schema for GA
[x] Backup current Domo charts
[ ] Processes are documented
[x] Migrate charts from GA to BigQuery
Acceptance Criteria
[ ] Understand how to switch over GA Domo datasets to BigQuery