usgs-makerspace / makerspace-sandbox

Some initial R code for playing with data processing (maybe some light visualization).

Get Sample Data for Entry and Exit Pages #409

Open abriggs-usgs opened 4 years ago

abriggs-usgs commented 4 years ago

In creating the Analytics Dashboard, we want to track an individual user’s navigation through a website so that we can understand whether the navigation approaches used in current applications are successful.

To accomplish that we need . . .

Completion Criteria -

wdwatkins commented 4 years ago

I uploaded landing_exit_pages.csv in internal-wma-test-website/analytics/data/dashboard-test. This contains entry, second, and exit pages for the last year for a few different applications, grouped by session. [image]

Let me know if a different format would be more convenient. We may want to look into using a binary format down the road; this kind of data can get quite long and balloon in size as ASCII.

View name and view id distinguish each application.

wdwatkins commented 4 years ago

https://github.com/usgs-makerspace/analytics_pipeline/pull/1

mhines-usgs commented 4 years ago

is there some way to get a more horizontal view of this? e.g. a session, landing page, second page, exit page layout? right now it seems like the session is not unique? e.g. i see many session = 4 records, but with many different landing/second/exit page combinations

wdwatkins commented 4 years ago

The sessions column is the number of sessions that involved those landing/second/exit pages. So it isn't a unique identifier, and we would expect there to be many combos with a low number of sessions. Does that make sense?
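To make the "count, not ID" reading concrete, here is a minimal Python sketch. The column names (`view_name`, `landing_page`, `second_page`, `exit_page`, `sessions`) and the sample rows are assumptions for illustration only; they are not taken from the real landing_exit_pages.csv header. Two rows can both show `sessions = 4` because each row is a distinct path that four sessions happened to follow, so the counts add up across rows.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: column names and values are assumptions,
# not the actual contents of landing_exit_pages.csv.
SAMPLE = """view_name,landing_page,second_page,exit_page,sessions
app-a,/,/map,/map,4
app-a,/,/about,/about,4
app-a,/map,/,/about,1
"""


def sessions_per_landing_page(csv_text):
    """Sum the session counts for each (view, landing page) pair."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # 'sessions' is a count of sessions that followed this path,
        # so it is added up rather than treated as an identifier.
        totals[(row["view_name"], row["landing_page"])] += int(row["sessions"])
    return totals


totals = sessions_per_landing_page(SAMPLE)
for (view, landing), count in sorted(totals.items()):
    print(view, landing, count)
```

In this made-up sample, the two `sessions = 4` rows that start at `/` represent eight sessions in total, not one session recorded twice.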

mhines-usgs commented 4 years ago

ok, is there a unique identifier here that allows us to get the individual user flow? i think that's what we're looking to produce here with this task, unless i'm misunderstanding the user story

wdwatkins commented 4 years ago

We don't have a unique identifier without implementing the client ID tracking, like we discussed in the user story meeting. This is about as granular as you can get out of a vanilla Google Analytics setup. Right now the only real multi-page application that has this is actually the current analytics dashboard 😄. WBEEP and the 2015 water use data viz do too, but they are mostly just one-page products that record click events. I can pull data from the dashboard if you want to play with it, though.

I think the current data gets us part of the way there on this user story. On an application with few pages, or an application with tons of traffic, we would expect multiple users to follow the same paths, so an aggregated view like this is worth something. This is the best we can do with API data on a consistent basis right now. We should think about how useful this data is compared to the user flow chart that GA already provides in the interface: [image]
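As a rough illustration of what an aggregated view could offer, the sketch below ranks landing/second/exit paths by their share of sessions within one application. The field names and numbers are hypothetical, and this is only one possible way to summarize the data, not something the dashboard currently does.

```python
# Hypothetical aggregated rows for a single application; the field names
# and session counts are made up for illustration.
ROWS = [
    {"landing_page": "/", "second_page": "/map", "exit_page": "/map", "sessions": 60},
    {"landing_page": "/", "second_page": "/about", "exit_page": "/about", "sessions": 30},
    {"landing_page": "/map", "second_page": "/", "exit_page": "/about", "sessions": 10},
]


def top_paths(rows, n=3):
    """Return the n most common paths as (path, sessions, share of all sessions)."""
    total = sum(r["sessions"] for r in rows)
    ranked = sorted(rows, key=lambda r: r["sessions"], reverse=True)
    result = []
    for r in ranked[:n]:
        path = " > ".join([r["landing_page"], r["second_page"], r["exit_page"]])
        result.append((path, r["sessions"], r["sessions"] / total))
    return result


for path, sessions, share in top_paths(ROWS, n=2):
    print(f"{path}: {sessions} sessions ({share:.0%})")
```

A summary like this shows which paths dominate, but it still cannot tell us what any single user did.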

mhines-usgs commented 4 years ago

ok, i guess we might want to revisit this user story then, since it's written from an individual user perspective. otherwise, should we just filter out duplicates by session and use that as a fake individual user? if that makes sense?