mitodl / ol-data-platform

Pipeline definitions for managing data flows to power analytics at MIT Open Learning
BSD 3-Clause "New" or "Revised" License

MicroMasters Summary report #528

Closed pdpinch closed 1 year ago

pdpinch commented 1 year ago

User Story

As a member of a MicroMasters team, I'd like to have a top-line summary of data so we can track the health of the program overall.

Description/Context

We currently have a report at https://bi.odl.mit.edu/queries/1056 that uses data collected in a Google Sheet: https://docs.google.com/spreadsheets/d/13tIKdblVxdfrdfe_dhIwtbCYWxTEi903-ahLE6DJ8w8/edit#gid=0

The script that is used to produce this summary has to be run manually and can no longer be maintained. We need something that is run on a regular basis (at least once a month) and is more reliable. In particular, we expect the number of total enrollments and unique learners to be always increasing.

Acceptance Criteria

Summary data for:

Related issues

abeglova commented 1 year ago

This is complete for enrollments, but not yet for verified learners or certificates.

abeglova commented 1 year ago

Also, we currently don't import marts into BI. That should be an easy fix for devops.

pdpinch commented 1 year ago

@abeglova I got a question from the DEDP team that I could probably answer, but I'd like to verify with you:

> I'm writing with a follow-up question to the data you shared at the last MicroMasters meeting. If I recall correctly, your table listed ~300,000+ unique DEDP learners. Our course team has counted ~50,000 learners, so we're curious to know how you're arriving at your number, and what we're doing differently.

Can you briefly explain our sources and method?

I suspect they don't have access to the full extent of the edX enrollment data. I think the first run of 14.73x alone had more than 40,000 enrollments.

pdpinch commented 1 year ago

@abeglova and I discussed this in person and confirmed that the enrollment total includes enrollments from edx.org and from mitxonline.mit.edu and is not limited to users who have created micromasters portal accounts.
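
To illustrate why counting across both platforms gives a larger number than counting only portal accounts, here is a minimal sketch. Everything in it is an assumption for illustration: the rows are invented toy data, the run labels are placeholders, and matching learners by email is a hypothetical choice, since this thread does not say how the mart identifies a unique learner.

```python
# Toy rows, invented for illustration only: (platform, learner_email, course_run).
# Matching learners across platforms by email is an assumption, not the mart's
# documented method.
enrollments = [
    ("edx.org", "a@example.com", "run-1"),
    ("edx.org", "b@example.com", "run-1"),
    ("mitxonline.mit.edu", "a@example.com", "run-2"),
    ("mitxonline.mit.edu", "c@example.com", "run-2"),
]

# Hypothetical set of learners who created a MicroMasters portal account.
portal_accounts = {"a@example.com"}


def summarize(rows):
    """Return (total enrollments, unique learners) for a list of enrollment rows."""
    total = len(rows)
    unique = len({email for _, email, _ in rows})
    return total, unique


# All enrollments from both platforms, regardless of portal account.
all_total, all_unique = summarize(enrollments)

# Only enrollments belonging to learners with a portal account.
portal_rows = [row for row in enrollments if row[1] in portal_accounts]
portal_total, portal_unique = summarize(portal_rows)

print(all_total, all_unique)        # counts across edx.org and mitxonline.mit.edu
print(portal_total, portal_unique)  # counts limited to portal accounts
```

A team that only sees a subset of the enrollment data would arrive at the smaller pair of numbers, which is consistent with the gap described above.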

pdpinch commented 1 year ago

This is partially done, which can be seen in the data mart ol_warehouse_production_mart.marts__micromasters_summary:

| program_title | total_enrollments | unique_users | unique_countries |
| --- | --- | --- | --- |
| Supply Chain Management | 1,038,566 | 493,487 | 251 |
| Data, Economics, and Development Policy | 591,913 | 329,810 | 245 |
| Statistics and Data Science | 848,292 | 512,995 | 251 |
| Finance | 221,043 | 136,632 | 240 |
| Principles of Manufacturing | 245,296 | 136,225 | 237 |
| total | 2,945,110 | 1,412,204 | 251 |
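
As a quick sanity check on the figures above, the sketch below recomputes the totals from the per-program rows. The numbers are copied from this comment; the variable names are hypothetical. The reading that the unique_users total is deduplicated across programs is an inference from the arithmetic, not something the mart documentation in this thread confirms.

```python
# (total_enrollments, unique_users) per program, copied from the table in this comment.
rows = {
    "Supply Chain Management": (1_038_566, 493_487),
    "Data, Economics, and Development Policy": (591_913, 329_810),
    "Statistics and Data Science": (848_292, 512_995),
    "Finance": (221_043, 136_632),
    "Principles of Manufacturing": (245_296, 136_225),
}
reported_total_enrollments = 2_945_110
reported_total_unique_users = 1_412_204

sum_enrollments = sum(enrollments for enrollments, _ in rows.values())
sum_unique_users = sum(users for _, users in rows.values())

# Enrollments are additive, so the total row should equal the column sum.
enrollments_match = sum_enrollments == reported_total_enrollments

# Unique users are not additive: a learner in several programs appears in each
# program row but (presumably) only once in the total row.
extra_per_program_counts = sum_unique_users - reported_total_unique_users

print(enrollments_match)          # True
print(extra_per_program_counts)   # 196945
```

The enrollment column adds up exactly, while the per-program unique_users rows sum to 196,945 more than the total row, which is what one would expect if the total counts each learner once.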

The next step is to add the following columns for every program except DEDP:

cnt_of_verified_enrollments, cnt_of_verified_enrollments_unique_learners, number_of_course_cert, course_cert_unique_learner_cnt, cnt_of_programcertificate, date_report

pdpinch commented 1 year ago

@rachellougee I hacked together a couple of queries to fill in the remaining columns for every program except DEDP. Can you review my queries?

DRAFT Verified Enrollments for MicroMasters Summary report: https://bi.odl.mit.edu/queries/1258/source
DRAFT Course Certificates for MicroMasters Summary report: https://bi.odl.mit.edu/queries/1257/source

I'd like to share the SDS and SCM numbers today, if possible.

rachellougee commented 1 year ago

@pdpinch Per discussion, this looks good to me

pdpinch commented 1 year ago

One more quick query, for program credentials. Again, the DEDP number is wrong.

DRAFT Program certificates for MicroMasters Summary report: https://bi.odl.mit.edu/queries/1260/source

pdpinch commented 1 year ago

@abeglova I tried refreshing the report at https://bi.odl.mit.edu/queries/1259 (which uses ol_warehouse_production_mart.marts__micromasters_summary). It seems like many of the counts went down from April, which can't be correct. I've highlighted the values that went down in red in this sheet: https://docs.google.com/spreadsheets/d/159CfV3doQ1_jYLn4SjaXJSFFmuk8FyQaRbKsNV9Qxjg/edit#gid=386779908

Can you take a look? Also, how/can we add some tests for this?
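
One possible shape for such a test, sketched under assumptions: keep the previous refresh of the summary as a snapshot and flag any cumulative count that went down or disappeared. The function name `find_regressions` and the snapshot layout are hypothetical and not part of the repo. The "previous" figures are copied from the summary table earlier in this thread; the "current" figures are made up purely to demonstrate a failure.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical snapshot layout: {program_title: {metric_name: value}}.
Snapshot = Dict[str, Dict[str, int]]
Regression = Tuple[str, str, int, Optional[int]]


def find_regressions(previous: Snapshot, current: Snapshot) -> List[Regression]:
    """Return (program, metric, old, new) for each count that decreased or is missing."""
    regressions: List[Regression] = []
    for program, old_metrics in previous.items():
        new_metrics = current.get(program, {})
        for metric, old_value in old_metrics.items():
            new_value = new_metrics.get(metric)
            # Cumulative counts should never shrink between refreshes.
            if new_value is None or new_value < old_value:
                regressions.append((program, metric, old_value, new_value))
    return regressions


# Figures from the summary table earlier in this thread.
previous = {
    "Finance": {"total_enrollments": 221_043, "unique_users": 136_632},
    "Principles of Manufacturing": {"total_enrollments": 245_296, "unique_users": 136_225},
}

# Made-up refresh, for illustration only: unique_users for Finance drops.
current_example = {
    "Finance": {"total_enrollments": 221_500, "unique_users": 136_000},
    "Principles of Manufacturing": {"total_enrollments": 245_296, "unique_users": 136_225},
}

regressions = find_regressions(previous, current_example)
for program, metric, old, new in regressions:
    print(f"{program}: {metric} went from {old} to {new}")
```

In a dbt project the same idea could be written as a singular test that returns the regressing rows, though where the previous values would be stored is left open here.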