department-of-veterans-affairs / va.gov-team

Public resources for building on and in support of VA.gov. Visit complete Knowledge Hub:
https://depo-platform-documentation.scrollhelp.site/index.html

INTL - Sitewide Homepage Content copy #44072

Closed chloedotbrown closed 2 years ago

chloedotbrown commented 2 years ago

Issue Description

Duplicate Content Dashboard and separate data source for Sitewide Homepage team. Cannot add to Standardized Content KPIs due to data size and existing latency issues (to be addressed in future tech debt ticket, likely once Data Engineer role filled).

See #41503 for VFS-facing ticket.


Tasks

Data @chloedotbrown

Dashboard @chloedotbrown

QA @pavanhothi & @jonathan-epstein13

Delivery @chloedotbrown

Acceptance Criteria

chloedotbrown commented 2 years ago

Data QA - complete

Created a TEST_last_week_homepage view in the analytics_testing folder in BigQuery. This mirrors the logic in the current Domo Content Engagement connector query, but instead of joining with the content table, data is filtered to only pull in the homepage. Fields from the content table, such as product and page_name, are hard-coded.

Reference: BigQuery QA instructions

Structure tests

Uniqueness

Passed! Total rows is the same value as the count of distinct hit IDs (combination of session and hit number).

Conducted on 7/3-7/9 data.

| Total rows | Unique IDs |
| -- | -- |
| 1,347,636 | 1,347,636 |
Test code:

```sql
select
  count(*) as total_rows,
  count(distinct concat(session, hit_number)) as total_uuid
from `vsp-analytics-and-insights.analytics_testing.TEST_last_week_homepage`;
```
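The same check can be sketched outside BigQuery. The snippet below is a minimal illustration in Python, assuming each row is a dict with `session` and `hit_number` keys; the sample rows are made up and the function name is hypothetical. It keys on the (session, hit_number) pair rather than a concatenated string, because joining two values without a delimiter could in principle make different pairs look identical (for example, session "1" with hit 23 and session "12" with hit 3).

```python
def uniqueness_check(rows):
    """Return (total_rows, unique_ids) for rows keyed by session + hit number."""
    total_rows = len(rows)
    # A tuple key avoids the ambiguity of joining the two values into one string.
    unique_ids = len({(row["session"], row["hit_number"]) for row in rows})
    return total_rows, unique_ids


# Hypothetical sample rows, not real data.
sample_rows = [
    {"session": "1", "hit_number": 23},
    {"session": "12", "hit_number": 3},
    {"session": "12", "hit_number": 4},
]

total_rows, unique_ids = uniqueness_check(sample_rows)
print(total_rows == unique_ids)  # the test passes when both counts match
```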

Nulls

Passed! Identified errors in the underlying content last week view in BigQuery that were introduced in the April edits. Added page detail data back in and switched subqueries to CTEs to improve readability, both for last week and full backfill versions. There remain some areas of improvement for the link_label field, affecting both content and homepage data – notably, some CTAs result in nulls where they shouldn't. These will be wrapped into the content QA to be added to tech debt.

Conducted on 7/10 data – full week of data too large to process efficiently.

| Column | Null rate | Explanation |
| -- | -- | -- |
| search_keyword | 100% | ✅ Not used in any cards or calculations; deleting field from content views and connectors before backfill |
| event_label | 84% | ✅ Null for all pagehits; corresponds with ratio of page vs. event hits |
| event_action | 84% | (see above) |
| mobile_brand | 60% | ✅ Corresponds with device_category = desktop rates for each (values only available for mobile, tablet) |
| mobile_model | 60% | (see above) |
| previous_page | 41% | ✅ Corresponds with percent of rows where pageview was first hit of session |
| time_to_event | 41% | ✅ Corresponds with percent of rows where hit was the first one included in content view, and therefore nothing for lag function to process |
| time_on_page | 41% | ✅ Corresponds with percent of rows where pageview was the first one included in content view, and therefore nothing for lag function to process |
| link_label | 9% | ✅ Corresponds with percent of scroll event rows, where label is null |
Test code:

```sql
with content_nulls as (
  select col_name, count(1) nulls_content, 'a' as join_key
  from `vsp-analytics-and-insights.analytics_testing.TEST_last_week_content` t,
    unnest(regexp_extract_all(to_json_string(t), r'"(\w+)":null')) col_name
  group by col_name
),
content_all as (
  select 'a' join_key, count(*) as total_content
  from `vsp-analytics-and-insights.analytics_testing.TEST_last_week_content`
),
pct_content as (
  select col_name, round((nulls_content/total_content), 2) as pct_content
  from content_nulls
  left join content_all using (join_key)
),
home_nulls as (
  select col_name, count(1) nulls_home, 'a' as join_key
  from `vsp-analytics-and-insights.analytics_testing.TEST_last_week_homepage` t,
    unnest(regexp_extract_all(to_json_string(t), r'"(\w+)":null')) col_name
  group by col_name
),
home_all as (
  select 'a' join_key, count(*) as total_home
  from `vsp-analytics-and-insights.analytics_testing.TEST_last_week_homepage`
),
pct_home as (
  select col_name, round((nulls_home/total_home), 2) as pct_home
  from home_nulls
  left join home_all using (join_key)
)
select *
from pct_content
full join pct_home using (col_name)
```
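The query above finds null columns by serializing each row to JSON and pulling out every key whose value is `null`. A rough Python analogue of that trick is sketched below, assuming rows are plain dicts; the sample rows and the `null_rates` helper are hypothetical and only meant to show the mechanics. As in the SQL, columns that are never null do not appear in the result.

```python
import json
import re
from collections import Counter

# Same pattern as the SQL: a quoted key immediately followed by :null.
NULL_FIELD = re.compile(r'"(\w+)":null')


def null_rates(rows):
    """Return {column: share of rows where the column is null}, rounded to 2 places."""
    null_counts = Counter()
    for row in rows:
        # Compact separators produce "col":null, the shape the pattern expects.
        serialized = json.dumps(row, separators=(",", ":"))
        null_counts.update(NULL_FIELD.findall(serialized))
    total = len(rows)
    return {col: round(count / total, 2) for col, count in null_counts.items()}


# Hypothetical sample rows, not real data.
sample_rows = [
    {"page_title": "Home", "event_label": None, "link_label": None},
    {"page_title": "Home", "event_label": None, "link_label": "Main Button"},
    {"page_title": "Home", "event_label": "click", "link_label": "Zone One"},
    {"page_title": "Home", "event_label": None, "link_label": "Zone One"},
]

print(null_rates(sample_rows))
```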

Completeness

Passed! Checked row distributions for main filters to be used in new dashboard.

Test code:

```sql
/**************************************
NOTE: Highlight and run separately
**************************************/

-- Completeness test - Date
select date, count(*) as total_rows
from `vsp-analytics-and-insights.analytics_testing.TEST_last_week_homepage`
group by 1

-- Completeness test - Device
select device_category, count(*) as total_rows
from `vsp-analytics-and-insights.analytics_testing.TEST_last_week_homepage`
group by 1

-- Completeness test - Browser
select browser, count(*) as total_rows
from `vsp-analytics-and-insights.analytics_testing.TEST_last_week_homepage`
group by 1
order by 2 desc

-- Completeness test - Page title (multiple for language variations)
select page_title, count(*) as total_rows
from `vsp-analytics-and-insights.analytics_testing.TEST_last_week_homepage`
group by 1

-- Completeness test - Product (should be 1)
select product, count(*) as total_rows
from `vsp-analytics-and-insights.analytics_testing.TEST_last_week_homepage`
group by 1
```
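Each completeness query is a group-by row count on one filter column. A minimal Python sketch of the same idea follows, assuming rows are dicts; the `row_distribution` helper and the sample values (including the product value) are hypothetical.

```python
from collections import Counter


def row_distribution(rows, column):
    """Count rows per distinct value of `column`, largest group first."""
    counts = Counter(row[column] for row in rows)
    return counts.most_common()


# Hypothetical sample rows, not real data.
sample_rows = [
    {"device_category": "desktop", "product": "homepage"},
    {"device_category": "mobile", "product": "homepage"},
    {"device_category": "mobile", "product": "homepage"},
]

print(row_distribution(sample_rows, "device_category"))

# The product check expects exactly one distinct value.
print(len(row_distribution(sample_rows, "product")) == 1)
```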

Accuracy tests

Conducted on 7/3-7/9 data.

Reference: QA for standardized dashboards; WIP Content QA spreadsheet

Highlights

| Chart Title | Pass/Fail |
| -- | -- |
| Usage trends - unique users | |
| Usage trends - total interactions | |
| Unique users - new vs. returning | |
| Unique pageviews | ✅ |
| Total clicks by device | |

Task completion

| Chart Title | Pass/Fail |
| -- | -- |
| User interaction rate | ✅ |
| Total interaction clicks | ✅ |
| Avg. clicks per user | ✅ |
| Interactions with most clicks | ✅ |

Ease of use

| Chart Title | Pass/Fail |
| -- | -- |
| Avg. scroll depth distribution | ✅ |
| Avg. minutes on page | ✅ |
| Avg. scroll depth | ✅ |

Findability

| Chart Title | Pass/Fail |
| -- | -- |
| Top pages previously viewed | ✅ |
| Devices used to access the page | ✅ |
| Top browsers used to access the page | ✅ |

chloedotbrown commented 2 years ago

BigQuery data QA is complete. The new Sitewide Homepage Content connector is currently running, as is a test dataset for the content dashboard, which I'll give a quick check tomorrow before running a real backfill to the production content data. Created new ticket to formally QA Content dataset - it's assigned to me, is attached to our Tech Debt epic, and is in the Icebox column for now.

Next steps for this ticket's work - once backfill has completed tomorrow, I'll complete the dashboard build and send for QA.

chloedotbrown commented 2 years ago

@michelle-dooley

Current completed work

Data: Even after separating homepage data from the existing Content data source, the event-level data was still too large for Domo cards to process. Therefore, I've created 3 datasets based on the vw_content_engagement view in BQ that transform the data enough to make it manageable while still allowing enough flexibility for accurate counts when filtering/grouping by dates. New datasets include:

Dashboard: WIP version of Site-wide Homepage Engagement dashboard is created, cards are connected to new data, and access has been granted to everyone on the Analytics team.
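To illustrate the trade-off mentioned under Data above (smaller datasets versus accurate counts when grouping by date), here is a minimal Python sketch. It is not the actual dataset logic, which this ticket does not spell out; the `(date, user_id)` row shape, the sample values, and the `daily_rollup` helper are all assumptions. The point is only that hit totals can be summed across pre-aggregated days, while distinct counts cannot.

```python
from collections import defaultdict

# Hypothetical event-level rows: (date, user_id). Not real data.
events = [
    ("2022-07-31", "u1"),
    ("2022-07-31", "u1"),
    ("2022-07-31", "u2"),
    ("2022-08-01", "u1"),
    ("2022-08-01", "u3"),
]


def daily_rollup(rows):
    """Collapse event rows to one entry per day: (hits, distinct users)."""
    hits = defaultdict(int)
    users = defaultdict(set)
    for date, user_id in rows:
        hits[date] += 1
        users[date].add(user_id)
    return {date: (hits[date], len(users[date])) for date in hits}


rollup = daily_rollup(events)

# Hit totals are additive, so summing the daily rows stays accurate.
total_hits = sum(h for h, _ in rollup.values())

# Distinct users are not additive: u1 appears on both days.
summed_daily_users = sum(u for _, u in rollup.values())
true_distinct_users = len({user_id for _, user_id in events})

print(total_hits, summed_daily_users, true_distinct_users)
```

Under these assumptions, a card that needs distinct counts over an arbitrary date range would have to read from data kept at a grain that still carries the identifier, which may be one reason a single rolled-up dataset was not enough.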

Next steps

Data: During QA, I realized that the underlying vw_content_engagement view does not capture a significant portion of links on the homepage, such as "Zone One" or "Main Button" clicks. I suspect this is because these configurations exist primarily on the homepage and therefore weren't included when Jason created the Content views originally. However, because I'm not 100% certain that making changes here won't impact all our other content products, I'm going to create vw_homepage_engagement and `..._last_week` views in BQ and reconnect the 3 Domo datasources to them instead.

In addition, I've verified with Michelle M that not only is it important to include this data, but it would also be really useful to them to be able to filter by location on page. Since this has taken so long to create and they'll now need their own view anyway, I've offered a different engagement_type taxonomy for their dashboard as a new feature. I don't think this will add much time on my end, but it will make the dashboard much more useful to the Homepage team.

Dashboard: Once data is backfilled on the new view, I'll do the Data QA, since I've been working so closely with it and can integrate any changes needed quickly. However, it would be great to give Design QA and any fixes needed to Jonathan or Pavan.

chloedotbrown commented 2 years ago

Data QA checklist

Dates checked: July 31 - August 6, 2022

Highlights

| Card title | Data assets | Pass / Fail |
| -- | -- | -- |
| Total sessions | Audience / Overview / Segment where page = homepage | Pass ✅ |
| Sessions w/ interactions | Queried daily GA tables in BQ | Pass ✅ |
| New vs. returning sessions | Queried daily GA tables in BQ | Pass ✅ |
| Sessions by device | Audience / Mobile / Overview / Segment where page = homepage | Pass ✅ |
| Total pageviews | Behavior / Site / All Pages | |

Task completion

| Card title | Data assets | Pass / Fail |
| -- | -- | -- |
| Interaction rate | Queried daily GA tables in BQ | Pass ✅ |
| Total interactions by type | Depended on category - Queried daily GA tables in BQ or used Behavior / Events / Top events with page + event label filters | Pass ✅ |
| Total interactions (all) | Queried daily GA tables in BQ | Pass ✅ |
| Avg. interactions | Queried daily GA tables in BQ | Pass ✅ |
| Top interactions | Custom report to check totals and event action order | Pass ✅ |

Ease of use

| Card title | Data assets | Pass / Fail |
| -- | -- | -- |
| Scroll depth distribution | Custom report to check totals | Pass ✅ |
| Avg. min on page | Queried homepage view in BQ | Pass ✅ |
| Avg. scroll depth | Queried homepage view in BQ | Pass ✅ |

Findability

| Card title | Data assets | Pass / Fail |
| -- | -- | -- |
| Previous page | Queried homepage view in BQ | Pass ✅ |
| Top browsers used | Audience / Technology / Browser / Segment where page = homepage | Pass ✅ |

chloedotbrown commented 2 years ago

@pavanhothi - the Site-wide Homepage Engagement dashboard is ready for design QA. You can find instructions, markdown templates, and a spreadsheet version of the usability checklist here.

This dashboard should have a more rigorous design QA than most, given the risks of making a copy of an existing dashboard. In particular, please give extra attention to the following:

I'll start a thread in our team's Slack channel for any questions that may come up as you go!

michelle-dooley commented 2 years ago

@chloedotbrown and @pavanhothi - I have completed the 1st round of design QA; see below. For the most part everything looked great!! I made some text edits where I could, but there were a couple of things I'm not sure how to do. @pavanhothi - could you please take a look at the ones I marked failed and correct them? Could you also please double-check the drill paths? I think they are good, but since Chloe called them out, another set of eyes would be good.

| Test | Expected Behavior | Actual Behavior | Pass/Fail |
| -- | -- | -- | -- |
| Dashboard title | Text fully readable, without errors | | ✅ |
| Date filter | Every chart gets filtered for date and group by settings are changed | The following cards do not update when the date selection is for any time period in current year: Total pageviews; Total interactions; Avg interactions; Avg minutes on page; Avg scroll depth | ❌ |
| Section titles | Every section title & subtitle are spelled correctly | | ✅ |
| Section resources | Links to How-to Guide + Data Dictionary (if available) | | ✅ |
| Chart titles | Every chart title is fully readable, without errors, relevant to chart | | ✅ |
| Chart descriptions | All chart descriptions are disabled, so nothing appears when hovering over title | | ✅ |
| Hover text | All hover text "makes sense" and is spelled correctly | | ✅ |
| Annotations | Every annotation "makes sense," is free of errors, and visible within text box | | ✅ |
| Filter cards | Filter cards are easy to read and use | | ✅ |
| Text box formatting | Every annotation and section title box shows all text, does not generate scroll bar | | ✅ |
| Card interactions – filters | Clicking data point does not filter other cards | | ✅ |
| Card interactions – card detail | Chart title opens card in new tab | The following cards open in current tab, not new tab: "Total interaction by type"; "Top interactions (drill in place)" | ❌ |
| Card drillpaths | Card detail view filters to correct drillpath _(if using)_ | | ✅ |
| Display settings – date units | Date grouping ("by week/month") appears under each title | | ✅ |
| Display settings – multi-value gauges | Directionality of colors in period-over-period comparison makes sense | | ✅ |
| Display settings – summary numbers | All non-multivalue gauge charts without a date axis show date range in summary number | | ✅ |
| Related cards | In card detail view, all related card previews have been deleted | | ✅ |
| Appendix | No charts or notecards appear in appendix below dashboard | | ✅ |

pavanhothi commented 2 years ago

@michelle-dooley thank you! I will go in and make any necessary corrections. And I'll do another check on drill paths etc.

pavanhothi commented 2 years ago

@michelle-dooley @chloedotbrown I have completed design QA from my end. Some notes based on QA above:

This looks great Chloe!

michelle-dooley commented 2 years ago

Thanks @pavanhothi @chloedotbrown - can you look into the following remaining items....

chloedotbrown commented 2 years ago

Thanks @michelle-dooley and @pavanhothi ! Here are the final checks Michelle flagged:

With that, I think we should be good for Design QA! Michelle, feel free to take a final look and let me know if this is good to close out and send to Michelle M?

michelle-dooley commented 2 years ago

Awesome!! @chloedotbrown I thought that I might not have been interpreting the multi-gauges correctly, thanks for clarifying!! Sounds like we are good to sign off on this one, YAY!! Please go ahead and let "other Michelle" :) know.

chloedotbrown commented 2 years ago

Perfect! Closing out this internal ticket in the meantime.