18F / identity-analytics-etl

ETL and SQL scripts for Login.gov data warehouse and business intelligence
10 stars 6 forks source link

Monthly Usage Stats in Login.gov Quicksight Dashboard #101

Closed akhlaqkhan closed 5 years ago

akhlaqkhan commented 6 years ago

We need to create a new dashboard for following monthly usage statistics in login.gov quicksight dashboard:

Here is a sample output from a query that @cacraig has already written to execute on-demand in prod.

    month        |  signups  | user_logins | logins

---------------------+---------+-------------+--------- 2017-09-01 00:00:00 | 7753 | 5823 | 27579 2017-10-01 00:00:00 | 622510 | 400331 | 2379657 2017-11-01 00:00:00 | 374397 | 473657 | 2168444 2017-12-01 00:00:00 | 289400 | 458332 | 1942335 2018-01-01 00:00:00 | 388868 | 564186 | 2368958 2018-02-01 00:00:00 | 594476 | 687835 | 2770145 2018-03-01 00:00:00 | 1314471 | 1308700 | 6971466 2018-04-01 00:00:00 | 1116055 | 1680305 | 9255338 2018-05-01 00:00:00 | 915314 | 1651717 | 8563375 2018-06-01 00:00:00 | 363127 | 975764 | 3601333

Since the order of magnitude for "Total logins" is several times higher than signups and unique user logins, and is expected to escalate at a high rate, it would make sense to keep these stats in two separate graphs: 1. Bar graph for signups and unique user logins. 2. Trend graph for Total Logins.

Also, it would be nice to have a filter to break these out on agency/app basis so that these stats can be shared with the customer agencies.

mugizico commented 6 years ago

@cacraig do queries or even views tables already exist for these? if yes, then creating the dashboard from them should be trivial using https://github.com/18F/identity-analytics-etl/wiki/Create-A-Custom-Dashboard-From-a-Redshift-View

cacraig commented 6 years ago

See the following views:
@mugizico

monthly_signups_with_user_logins
password_success_rate_by_month
monthly_signups

This is not quite as simple as adding it to the dashboard. These are super intensive queries that would need to be rolled up. We would need some kind of lambda/cron to run maybe once per day, and materialize these into a table.

akhlaqkhan commented 6 years ago

agreed with @mugizico. based on a brief chat with joel, i analyzed this more and figured this should be in a new dashboard along with other metrics that we need for OMB reporting. Here is a mockup of how i am envisioning it. feedback is welcome!

image

mugizico commented 6 years ago

@akhlaqkhan I may have spoken too early, I talked with Craig and like he mentioned it does seem like these queries are really intensive and can choke our overall data processing pipeline if they are run multiple times per day(like we do for other views).

I was doing some research and it looks like we might be able to do some SQL tuning. Our current dashboard is growing large as well and we should definitely break it apart, based on audience. Something we can discuss next week.

I like that dashboard. If we can figure out these query performance issues, then we should be able to create that dashboard pretty easily.

akhlaqkhan commented 5 years ago

Initial data sets needed by USAJObs:

mugizico commented 5 years ago

Sounds good! I don't think there is any Help Desk Metrics in Redshift, you will probably still have to get that from the vendor/customer center. This is the list of metrics that we collect https://github.com/18F/identity-idp/blob/stages/rc-2018-06-07/app/services/analytics.rb

On Thu, Aug 2, 2018 at 1:00 PM, Akhlaq Khan notifications@github.com wrote:

Initial data sets needed by USAJObs:

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/18F/identity-analytics-etl/issues/101#issuecomment-409997212, or mute the thread https://github.com/notifications/unsubscribe-auth/AA-w4BBl1StMSNu8JMT5jehKCSaV4mK9ks5uMzAbgaJpZM4U2tHS .

-- Jean Paul Mugizi 18F Engineering Team U.S. General Services Administration

lauraGgit commented 5 years ago

Migrated to jira