department-of-veterans-affairs / va.gov-team

Public resources for building on and in support of VA.gov. Visit complete Knowledge Hub:
https://depo-platform-documentation.scrollhelp.site/index.html

Understand numbers of new and returning user sessions to MHV on VA.gov portal #89113

Open wesrowe opened 3 months ago

wesrowe commented 3 months ago

Description

User story

As a Cartography team member, I want to understand how many sessions are first-time users of the MHV-on-VA.gov portal versus returning users.

Notes

Possible tasks:

Acceptance criteria

wesrowe commented 2 months ago

I played around with Segments in GA4 explore reports – the reports are side-by-side, which I don't love. See first tab of this exploration.

I tried to use Comparisons (in the Pages and Screens report), but couldn't find the first_visit event in the comparison builder. Would love for this to work though! Will ping Jamie in #vfs-analytics.

wesrowe commented 1 month ago

Forgot to update... Jamie doesn't believe what we're trying to do is possible in GA. The exploration I linked above relies on GA4's out-of-the-box "new user" functionality, which means "new users for the GA property/domain." It can't be tailored to a subset of pages, e.g. the /my-health/ section of the site. So, for example, someone who visited the unauthenticated Health care hub (/health-care/) last month and logs into /my-health/ for the first time today will be counted today as a returning visitor.
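
To make the scoping problem concrete, here is a minimal Python sketch. It is not GA4's actual logic; the page-view records, user ids, dates, and the `classify` helper are all made up for illustration. It only shows how the same visitor gets a different label depending on whether "new" is judged across the whole property or within /my-health/ alone.

```python
from datetime import date

# Hypothetical page-view log: (user_id, visit_date, path). Not real VA.gov data.
PAGE_VIEWS = [
    ("user-a", date(2024, 5, 3), "/health-care/"),
    ("user-a", date(2024, 6, 10), "/my-health/"),
    ("user-b", date(2024, 6, 10), "/my-health/"),
    ("user-b", date(2024, 6, 12), "/my-health/"),
]


def classify(page_views, on_date, section_prefix=""):
    """Label each user active in the section on `on_date` as 'new' or 'returning'.

    A user is 'new' when they have no earlier view under `section_prefix`.
    An empty prefix matches every path, which stands in for property-wide scoping.
    """
    labels = {}
    for user, day, path in page_views:
        if day == on_date and path.startswith(section_prefix):
            seen_before = any(
                u == user and d < on_date and p.startswith(section_prefix)
                for u, d, p in page_views
            )
            labels[user] = "returning" if seen_before else "new"
    return labels


property_wide = classify(PAGE_VIEWS, date(2024, 6, 10))
section_only = classify(PAGE_VIEWS, date(2024, 6, 10), "/my-health/")

print(property_wide)  # user-a is 'returning' because of the earlier /health-care/ visit
print(section_only)   # user-a is 'new' when only /my-health/ views are considered
```

The section-scoped label is the one this ticket is after, and it requires knowing each user's first visit to /my-health/ specifically.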

wesrowe commented 3 weeks ago

I just poked around Datadog, looking for a way to see this. I found this out-of-the-box report, which comes pretty close because it shows how many sessions each unique user has had. Note that the top row's "blank user id" gets 11k.

Image

wesrowe commented 1 week ago

Based on the report/screenshot above:

Oh wait... What is N/A?? I found an advanced setting to "show N/A" and there are a lot of those. NOTE: the N/As bump up the total shown. When I "hide N/A" (the default), the total is much lower (by the number of N/As). Image
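
As a rough illustration of why the total changes with the "show N/A" setting, here is a small Python sketch of the arithmetic. It does not reflect how Datadog computes the report; the session list and the `sessions_per_user` helper are hypothetical, with `None` standing in for a session that has no user id.

```python
from collections import Counter

# Hypothetical sessions, one entry per session; None means no user id was attached.
SESSIONS = ["u1", "u1", "u2", None, None, None, "u3", None]


def sessions_per_user(session_user_ids, show_na=False):
    """Count sessions per user id; id-less sessions are grouped as 'N/A' only if show_na."""
    counts = Counter()
    for user_id in session_user_ids:
        if user_id is None:
            if show_na:
                counts["N/A"] += 1
        else:
            counts[user_id] += 1
    return counts


hidden = sessions_per_user(SESSIONS)
shown = sessions_per_user(SESSIONS, show_na=True)
missing_share = shown["N/A"] / sum(shown.values())

print(sum(hidden.values()))    # total with N/A hidden
print(sum(shown.values()))     # total with N/A shown
print(f"{missing_share:.0%}")  # share of sessions with no user id
```

The gap between the two totals is exactly the number of id-less sessions, so the share of sessions missing a user id can be read off by comparing them.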

wesrowe commented 1 week ago

The numerous-sessions-per-userID problem/thing seems pretty widespread. Here's the same session chart as above, but this time as a time series. Each color in a bar is a single user's sessions, except for the bottom section (which in every minute is the N/As lumped together). Image

The colors are easier to see in this one (since I'm not hovering on a bar): Image

wesrowe commented 6 days ago

I have a Slack thread going on the public DD Slack regarding the high (>50%) rate of missing userIDs. Egor has provided some clues about how to check whether our implementation has an issue. Anyone can sign up for this Slack and participate.

wesrowe commented 6 days ago

I also have a support email thread going with JT Skidmore (DD). This is his suggestion for creating a new/return Metric (?) and assigning it to the user in our FE code...


you can track this information by adding a global context in your application's code for new users. This will allow you to obtain that metadata in your account, which you could then use to monitor new vs. returning users.

For example, you could create a check function in your application similar to the following:

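    def tag_user(user, global_context):
        # checkUserInDB and addToDatabase stand in for your own lookup and storage.
        if checkUserInDB(user):
            return {"returninguser_count_metric": 1}
        global_context["newusertag"] = True  # mark the session as a new user
        addToDatabase(user)  # so the next visit counts as returning
        return {"newuser_count_metric": 1}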

To take it a step further, in the code snippet you add, you could also submit a custom metric for new/returning user count. This would allow you to then use these metrics to further monitor the count between both users.

The upside of doing this is the retention period for custom metrics is longer than the retention for RUM, so you could compare values month to month. For more information about the retention periods, this doc might be helpful.

dcloud commented 4 days ago

Per engineering refinement, this could be multiple tickets:

  1. Investigate why the app may be missing >50% of userIDs
    • Are privacy tools able to block the user ID while still allowing DD to record something?
    • Is the code not reliably setting the user ID when an ID becomes available?
    • Other possible reasons we might see (authenticated) users in DD without the anonymous unique identifier
  2. Solution for tracking new vs. returning users (presumably using global context)
    • DD doesn't have a mechanism to do this, apparently?