lzim / teampsd

Team PSD is using GitHub, R and RMarkdown as part of our free and open science workflow.
GNU General Public License v3.0

Story J: User needs more accurate end-user time reporting of events in Sim UI and Data UI #2348

Closed. jamesmrollins closed this issue 1 year ago

jamesmrollins commented 2 years ago

Time Logging Inaccurate

When the user logs into the Sim UI and begins a session, if they don't close the browser or log out, the time clock keeps ticking. This leads to inflated user session hours.
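A minimal sketch of the inflation described above, written in Python for illustration only. It assumes we have a list of timestamped user events for a session and treats any gap longer than a cutoff as idle time. The 15-minute cutoff, the function name, and the sample timestamps are all hypothetical, not taken from the Sim UI.

```python
from datetime import datetime, timedelta

# Hypothetical idle threshold; the real value would need to be agreed on.
IDLE_CUTOFF = timedelta(minutes=15)


def engaged_time(event_times, idle_cutoff=IDLE_CUTOFF):
    """Sum the gaps between consecutive events, skipping gaps longer than idle_cutoff."""
    ordered = sorted(event_times)
    total = timedelta(0)
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        if gap <= idle_cutoff:
            total += gap
    return total


# Made-up session: three clicks in the morning, then the tab is left open all day.
events = [
    datetime(2022, 1, 10, 9, 0),
    datetime(2022, 1, 10, 9, 5),
    datetime(2022, 1, 10, 9, 12),
    datetime(2022, 1, 10, 17, 0),
]

wall_clock = max(events) - min(events)
print(wall_clock)            # 8:00:00 (what a running clock would report)
print(engaged_time(events))  # 0:12:00 (time between events, idle gap excluded)
```

The point of the sketch is only that a wall-clock session length and an event-based estimate can differ by hours for the same session.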

jamesmrollins commented 2 years ago

At the 11/17 Workgroup Leads meeting, we decided to move this to the January Epic.

jamesmrollins commented 2 years ago

One way to improve the accuracy of time hacks is to automatically log out the user when the page sits idle.

See example below from VA Citrix

image

dkngenda commented 2 years ago

Hi James, should we move this to the ICE BOX instead of closing it?

jamesmrollins commented 2 years ago

Sorry @dkngenda - I meant to close #2338 because it was a duplicate, and got it backwards.

jamesmrollins commented 2 years ago

As per conversation with @lzim and @lijenn SWG 1.31

  1. Not sure an auto-close is a useful feature. It would improve data collection accuracy, but it creates an inconvenience for our end users.
  2. @jamesmrollins is going to investigate SmartLook to see if we can export actual interaction time with the Sim UI via an API.

jamesmrollins commented 2 years ago

@hirenp-waferwire - We need to investigate if the SmartLook people provide an API. If so, can we export all the time a user is engaged with their Sim UI?

jamesmrollins commented 2 years ago

@lijenn @mnallajerla @lzim

Results of SmartLook API investigation

image

Recommended Next Steps

jamesmrollins commented 2 years ago

Discussed at the 2/16 Sim UI Meeting.

REF: PowerBI Story 13 and #2504

jamesmrollins commented 2 years ago

FYI @lzim @jeffhoerle @lijenn @mnallajerla

Reviewed IIR Grant today for potential measurements regarding user behaviors.

From Zimmerman_HSRD_AIMS_Strategy_SUMITTED.docx.

| Page | Source | Potential Measure |
| --- | --- | --- |
| p. 8 | Table 3 - Increase Learning/General QI Capacity | Trending increase of team and individual interactions with Data UI and Sim UI after session_12 |
| p. 14 | "Encourages safe prototyping of ideas, experiential learning and accumulates in new staff QI capacity over time" | Trending increase in teams' Team Data bookmarking, export and 100% experiment completion after session_12. |
| p. 14 | "Stakeholders make theory-based, causal attributions about the system and QI plans that are structured in a model, and validated with calibrated parameters from health system data.107,110 A significant advance beyond self-report of implementation ‘barriers and facilitators’ alone" | Measurement same as above. |

jamesmrollins commented 2 years ago

Support Workgroups Meeting 2/23:

From @jeffhoerle :

PowerBI Capabilities

The PowerBI Platform does have basic reporting tools for user interactions through "View User Metrics Report." These reports can be exported as xlsx files.

Automation Available for PowerBI via API.

In theory, there is likely a PowerBI API that could move this information from PowerBI to the Research Repo, although the Research Repo is not inside the domain. That would enable the Data Scientist to make R notebook calls to PowerBI to pull this data automatically. However, API calls from outside the domain are currently not allowed.

SmartLook Capabilities Via API

Sim UI Capabilities Via API

jamesmrollins commented 2 years ago

FYI @lzim @lijenn @mnallajerla @jeffhoerle

Thank you @lijenn for finding the R01 reference displayed below and for directing my attention to the appropriate grant documents.

From TeamPSD > Grants > NIH > R01_Resubmission_Mar2018 > ZIM.RESUB.R01.4.10.18.AppImageDraft.Final, p. 166

| Platform | Measure | Measure Available Y/N? | Alternative/comments |
| --- | --- | --- | --- |
| Power BI | # Learners | Yes | N/A |
| | # Unique Visits | Yes | N/A |
| | # Clicks | No | Page Visits |
| Sim UI | # Clicks | Yes | Page Visits, Experiment Completion % |
| | # Simulations | Yes | See question below |
| | # Sims Shared | Yes | See question below |
| | Sim Inputs and Outputs | Yes | total completed QHFD? |

Questions - Updated from Sim UI Meeting 3/2

From R01

image

From TeamPSD > Grants > HSRD > iir_resub3_2018_12 > zimmerman_2019_iir_irb_scientific_protocol.

jamesmrollins commented 2 years ago

Sim UI Meeting 3/2

jamesmrollins commented 2 years ago

As per conversation with @dkngenda:

  1. Using APIs to move page access information (by user and by team) from PowerBI and Sim UI to the corresponding R notebook is preferred. We need to confirm that PowerBI can link the user to the page access, rather than reporting page access statistics by team only.
  2. @jamesmrollins needs to investigate PowerBI API potential usage inside the domain, and if it will report the required information.
  3. @jamesmrollins will consult with @hirenp-waferwire regarding APIs and what may be required.

jamesmrollins commented 2 years ago

FYI @jeffhoerle @dkngenda @lzim @lijenn @mnallajerla

Hi @hirenp-waferwire - Please investigate:

jamesmrollins commented 2 years ago

FYI @lzim @lijenn @dkngenda @mnallajerla

@hirenp-waferwire - To access the SmartLook REST API, we have to buy up to the Business level ($95/mo) AND pay a $269/mo fee to use it. This is cost prohibitive. So, we should:

jamesmrollins commented 2 years ago

Hi @lzim @lijenn @mnallajerla @dkngenda

@hirenp-waferwire has determined how to get page access counts from Google. However, we have to do each query by user for a given date range. We can get the usernames from Epicenter and pair with the query, but we would still need some bounding variables so the search can be more efficient. Also, it would seem we would want to aggregate page access counts by group, versus by user. Please verify - thank you.
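A rough sketch of the by-group aggregation raised above, in Python for illustration. It assumes the per-user query results can be flattened into rows of user, page, and view count, and that a username-to-team lookup is available (for example from Epicenter). The row layout, the function name, and the sample data are hypothetical.

```python
from collections import Counter


def page_counts_by_team(rows, user_to_team):
    """Roll per-user page view counts up to (team, page) totals."""
    counts = Counter()
    for row in rows:
        # Users missing from the lookup are grouped under "unknown" rather than dropped.
        team = user_to_team.get(row["user"], "unknown")
        counts[(team, row["page"])] += row["views"]
    return dict(counts)


# Made-up example data, not real usage figures.
user_to_team = {"user_a": "team_1", "user_b": "team_1", "user_c": "team_2"}
rows = [
    {"user": "user_a", "page": "/sim", "views": 3},
    {"user": "user_b", "page": "/sim", "views": 2},
    {"user": "user_c", "page": "/sim", "views": 4},
    {"user": "user_d", "page": "/data", "views": 1},
]

team_totals = page_counts_by_team(rows, user_to_team)
for (team, page), views in sorted(team_totals.items()):
    print(team, page, views)
```

Keeping the per-user rows and aggregating afterwards leaves both views available, so the by-user versus by-group question does not have to be settled before the data are pulled.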

jamesmrollins commented 2 years ago

FYI @lijenn @lzim

Hi @dkngenda - Attached are the instructions for using the Google Analytics API to get page access data. I also sent instruction and invitation emails to your VA and Gmail accounts. API.docx

dkngenda commented 2 years ago

@jamesmrollins

Tried the instructions out and got stuck on step 3. Briefly, this is what I attempted

  1. Generated a redirect URI as stated in step 2 (see screenshot below) image

  2. Added this redirect URI into the link provided in step 3 of the instructions (see highlighted section in link below) image

  3. Copied this link into the Google Chrome address bar, but then I got this error message image

Any idea what I am missing here?

Thanks

jamesmrollins commented 2 years ago

Hi @hirenp-waferwire - Please take a look above at @dkngenda issues with the API. Can you offer any advice? Let me know if you two would like me to set up a meeting.

jamesmrollins commented 2 years ago

FYI @lijenn @lzim

@hirenp-waferwire, @dkngenda and I met last night to troubleshoot this API. There is a mismatch between the R code and the Java that was used to develop the API. In other words, a web server is required to use the API. Moreover, there is a multitude of data available since we installed the tracking cookies. We decided on the following:

  1. David will navigate to Google Analytics and manually download available data and consult with @lzim to determine which data parameters are needed.
  2. Hiren will investigate R code versions of the API, so they can be embedded directly into the R notebook. Failing an R notebook solution -
  3. We will investigate hosting the API on Epicenter and moving CSV data from Epicenter to a data repo in the research repo.

dkngenda commented 2 years ago

Hi @lzim @jamesmrollins

I went through the Google Analytics data and documented the data that I thought would be useful for Research/Ops activities. You can find the Excel file here on Teams. @lzim if you could take a look and comment on the variable list, we can refine it further. @jamesmrollins: I was actually able to download this data directly through PowerBI (and there is a lot more there), but I could not find a variable that captures a teamid in Google Analytics ... but you may be able to.

FYI @lijenn @mnallajerla

jamesmrollins commented 2 years ago

FYI @hirenp-waferwire @lzim

Hi @dkngenda

@hirenp-waferwire wrote sample code for the Google Analytics API in R. He tested it and it works. Please note that the Client ID and Client Secret are available in the file attached to the email sent 5/9 to your Gmail account. It is the MTL self-registration app, so there is no need to change the secrets.

Below are reference links.

  1. google analytics install
  2. metrics explorer

jamesmrollins commented 2 years ago

@dkngenda - did the above R code work for you?

dkngenda commented 2 years ago

@jamesmrollins Yes, I am able to pull Google Analytics data into the notebook. However, I can't pull individual/team data. @lzim It looks like we are able to track data for each page. Do we have specific pages we want to collect user data for?

jamesmrollins commented 2 years ago

Hi @dkngenda - thanks for the feedback. I will discuss with @hirenp-waferwire - I believe we had a programming strategy where we could enable an individual/team query by leveraging the Epicenter API, then loop it through the Google API. This can be accomplished with Java, so we just need to figure it out with R code (I think).

jamesmrollins commented 2 years ago

@dkngenda we will prioritize this after we complete the Team PSD 3.0 website update.

FYI @lzim

dkngenda commented 2 years ago

@lzim @lijenn

Is there anyone granting/reviewing requests to access the TeamPSD_datapipeline_trackers Power BI dashboard? I see some coming through my VA email. Not sure if they have been acted on.

FYI: @jamesmrollins

jamesmrollins commented 2 years ago

@hirenp-waferwire we can turn our attention back to this task. @dkngenda needs a way to loop the search function by user so he doesn't have to put in the userid individually. We discussed using the Epicenter API to provide the userids.
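A minimal sketch of the looping idea, in Python for illustration. It assumes the list of user IDs has already been obtained (for example from the Epicenter API) and that some function can run the per-user, per-date-range query. The function names and the stub below are hypothetical stand-ins, not the actual Epicenter or Google Analytics calls.

```python
def collect_page_views(user_ids, fetch_for_user, start_date, end_date):
    """Run the same per-user query for every user ID and gather the results."""
    results = {}
    failed = []
    for user_id in user_ids:
        try:
            results[user_id] = fetch_for_user(user_id, start_date, end_date)
        except Exception:
            # Record the failure and continue so one bad ID does not stop the loop.
            failed.append(user_id)
    return results, failed


def fake_fetch(user_id, start_date, end_date):
    """Stub standing in for the real per-user query; returns made-up counts."""
    if user_id == "bad_id":
        raise ValueError("no data for this user")
    return {"user": user_id, "start": start_date, "end": end_date, "views": len(user_id)}


results, failed = collect_page_views(
    ["user_a", "user_bb", "bad_id"], fake_fetch, "2022-01-01", "2022-01-31"
)
print(sorted(results))  # ['user_a', 'user_bb']
print(failed)           # ['bad_id']
```

Passing the query in as a function keeps the loop independent of whichever API ends up supplying the data.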

FYI @dkngenda - I don't know who is responsible for granting/reviewing requests for access.

jamesmrollins commented 2 years ago

Hi @dkngenda - Approximately 2 weeks ago @hirenp-waferwire provided you with an API to loop requests over user IDs in Google Analytics and download usage information for your R notebook. Did this code work for you? Can you give us feedback, please, so I can close this task or modify the API so it works for you? Thanks, James

FYI: @lzim

dkngenda commented 2 years ago

Hi @hirenp-waferwire - I am still not able to link Google Analytics to Sim UI. See the screenshot below to compare the two datasets.

image

As you can see, the user IDs in the two datasets are different. Is there a way of sending the ID from the Sim UI to Google Analytics so I can use it to link the two datasets?

FYI @jamesmrollins @lzim

jamesmrollins commented 2 years ago

Hi @hirenp-waferwire - please investigate and resolve if possible. Let me know if you need assistance.

FYI: @dkngenda @lzim

jamesmrollins commented 2 years ago

Hi @dkngenda - @hirenp-waferwire says to try using the MTL User View in Google Analytics:

image

The data are available in Google Analytics:

image

I'm not sure if this answers your question. Please let me know if you need more information.

dkngenda commented 2 years ago

Hi @jamesmrollins - Unfortunately I am still not able to pull the user ID through an API call in RStudio. I just read this article online (scroll all the way to the bottom of the article) that says you cannot get the user ID through an API call. However, it seems that you can create your own custom Google Analytics dimension and load the user ID into it, which I can then pull into RStudio through an API call. See how to create a custom dimension. @hirenp-waferwire might make more sense of this.

jamesmrollins commented 2 years ago

Ok, @hirenp-waferwire and I will check it out.

dkngenda commented 2 years ago

@jamesmrollins and @hirenp-waferwire

I realize that last time you sent me a message about this issue on my VA email. I just wanted to let you know that it's much easier to get my attention if you ping me on GitHub rather than over my VA email. If there are any updates on this issue, let me know. Thank you

jamesmrollins commented 2 years ago

Hi @dkngenda. I don't think we will be able to derive user-ID-based information from Google Analytics for dates prior to July 8, 2022. The problem is that the Google Analytics cookie gathered page view information but assigned its own unique ID number to each page visitor, and there is no way to tell who the user is from that number. However, we have added a custom dimension to Google Analytics, so it is now accurately reporting User ID, and that should be exploitable from the API. Moreover, the SmartLook platform appears to have deleted much of our historical information, for reasons I have not yet been able to investigate. If I can somehow get a mass download of information from the platform, that would be our only hope.

FYI: @hirenp-waferwire @lzim

dkngenda commented 2 years ago

@jamesmrollins @hirenp-waferwire

What is the name of the custom dimension you created?

dkngenda commented 2 years ago

@jamesmrollins @hirenp-waferwire

I saw your email on the status of this work. I was able to use the API information you provided to pull some data, but it can't really be tied to a specific identifiable user. If you run the IIR R notebook chunks under the SIM data section, you will be able to see the information I'm pulling. However, as per my comment on Aug 4 on this issue, I asked for the name of the Google Analytics custom dimension you created so I can see if I can pull information tied to a user. Did you provide that information somewhere other than on this issue?

Thank you

jamesmrollins commented 2 years ago

Hi @dkngenda - We reconfigured the API so that it should now be pulling user information by user ID. However, I hear you saying that the information is not available and you are asking if there are modifications to the API that you need to make to pull the right information. I will ask @hirenp-waferwire about this so perhaps we can clear this up. He will return to work Sunday night 9/4 and I will inquire with him. Thanks for your patience - James

jamesmrollins commented 2 years ago

@dkngenda I followed up with SmartLook to see if there were data we could download and, to my horror, found that all my data from years past had been removed. They adjusted the license so that they only retain 30 days of data. So their only use for us is immediate troubleshooting of users who encounter problems where we need a video of their interactions. Otherwise their license for enhanced data collection is cost prohibitive, especially since Google Analytics is free, as long as you can arrange the API correctly.

FYI: @KimCase123 @lzim

hirenp-waferwire commented 2 years ago

Hi @dkngenda - The custom dimension name for users' identifiers is "dimension1". The following image shows how we used it for testing purposes. The View ID is 265013385.

image
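For illustration, a sketch in Python of what a report request using this dimension and View ID might look like. It only builds and prints the request body, with no authentication or network call. The `ga:` prefix on the dimension, the `reportRequests` layout, the `ga:pageviews` metric and the `ga:pagePath` dimension are assumptions based on the Google Analytics Reporting API v4 format and are not confirmed in this thread; the end date is arbitrary.

```python
import json

VIEW_ID = "265013385"             # View ID given in this thread
USER_DIMENSION = "ga:dimension1"  # "dimension1" from this thread; the "ga:" prefix is assumed


def build_report_request(start_date, end_date, view_id=VIEW_ID):
    """Build a request body asking for page views per user identifier and page."""
    return {
        "reportRequests": [
            {
                "viewId": view_id,
                "dateRanges": [{"startDate": start_date, "endDate": end_date}],
                "metrics": [{"expression": "ga:pageviews"}],
                "dimensions": [{"name": USER_DIMENSION}, {"name": "ga:pagePath"}],
            }
        ]
    }


# Start date matches when the custom dimension began reporting User ID; end date is arbitrary.
body = build_report_request("2022-07-08", "2022-08-31")
print(json.dumps(body, indent=2))
```

The working R notebook code remains the reference; this only shows where the custom dimension would sit in a request.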

jamesmrollins commented 1 year ago

Hi @dkngenda - Can you verify if that dimension call above has solved the issue of getting data from Google Analytics? If you are satisfied with the API, then I can close this issue.

dkngenda commented 1 year ago

@jamesmrollins @hirenp-waferwire This is now working! Thanks for creating the UserId dimension

@lzim Below is an extract of the data as pulled into the R notebook. Are there other details you would want to pull into this dataset?

image

jamesmrollins commented 1 year ago

Great news! Thanks @dkngenda and @hirenp-waferwire !