mitodl / mitxonline

BSD 3-Clause "New" or "Revised" License

MM Data Migration: legacy final grades from MicroMasters #658

Closed rhysyngsun closed 2 years ago

rhysyngsun commented 2 years ago

We need to bring over the legacy grades for DEDP. These are spread out over a couple of different models on MicroMasters:

Whenever a learner gets a new record for either model, a CombinedFinalGrade is created or updated (https://github.com/mitodl/micromasters/blob/40deb7afba5cc93b2514fc2dfa050fdb96b1df54/grades/api.py#L299-L326). The combined grade exists only at the Course level and is continuously updated, so it's not useful for populating historical grades. It's mentioned here for context and because we'll use a similar algorithm.

The main problem we face is that, historically, the times when a learner takes an exam and when they take a course run are not guaranteed to intersect. In MITx Online they always will, so we need to figure out a reasonable way to map these over.

I think the most accurate approach would be to walk the learner's FinalGrade and ProctoredExamGrade records, compute what their CombinedFinalGrade would have been at each point in time, and give them a grade in MITx Online for the most recent run that combines the best of both.
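To make the idea concrete, here is a minimal sketch of that walk in plain Python. It is not the MicroMasters implementation: the `GradeEvent` type, the `walk_grades` helper, the 0.0 to 1.0 grade scale, and the 40/60 weighting are all assumptions for illustration and should be checked against the linked `grades/api.py` before anyone relies on them.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Tuple

# Assumed weights (hypothetical here); verify against the MicroMasters code.
COURSE_WEIGHT = 0.4
EXAM_WEIGHT = 0.6


@dataclass(frozen=True)
class GradeEvent:
    """One historical record: a FinalGrade ("course") or ProctoredExamGrade ("exam")."""
    kind: str                     # "course" or "exam"
    when: date                    # when the record was created
    grade: float                  # assumed normalized to 0.0 - 1.0
    run_id: Optional[str] = None  # course run id, only set for "course" events


def walk_grades(events: Iterable[GradeEvent]) -> Optional[Tuple[str, float]]:
    """Walk records chronologically, tracking the best course and exam grades.

    Returns (most_recent_run_id, combined_grade), or None if the learner
    never had both a course grade and an exam grade.
    """
    best_course = None
    best_exam = None
    latest_run = None
    combined = None
    for ev in sorted(events, key=lambda e: e.when):
        if ev.kind == "course":
            latest_run = ev.run_id
            best_course = ev.grade if best_course is None else max(best_course, ev.grade)
        elif ev.kind == "exam":
            best_exam = ev.grade if best_exam is None else max(best_exam, ev.grade)
        # The combined grade only exists once both halves are present.
        if best_course is not None and best_exam is not None:
            combined = round(COURSE_WEIGHT * best_course + EXAM_WEIGHT * best_exam, 4)
    if combined is None:
        return None
    return latest_run, combined
```

In this sketch a learner with runs "A" (0.8) and "B" (0.6) and a single exam (0.5) between them would get the combined grade attached to run "B", computed from the better course grade of run "A".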

Use Cases

Approach

We need to decide how to create CourseRunGrade records in MITx Online. Since these are rooted in runs we need to make sure that these records match up to enrollments. FinalGrade in MicroMasters follows this convention as well, so that's what we'll end up basing our approach off of.
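As a rough illustration of the "grades must match up to enrollments" constraint, the sketch below pairs grade rows with enrollment rows on `(user_id, course_run_id)` and separates out any grade that has no enrollment. The dictionary shapes and the `split_by_enrollment` name are hypothetical; the real records would be Django model instances.

```python
from typing import Dict, Iterable, List, Tuple


def split_by_enrollment(
    grades: Iterable[Dict], enrollments: Iterable[Dict]
) -> Tuple[List[Dict], List[Dict]]:
    """Split grades into those with a matching enrollment and those without.

    Both inputs are assumed to be dicts carrying "user_id" and "course_run_id".
    """
    enrolled = {(e["user_id"], e["course_run_id"]) for e in enrollments}
    matched, orphaned = [], []
    for grade in grades:
        key = (grade["user_id"], grade["course_run_id"])
        (matched if key in enrolled else orphaned).append(grade)
    return matched, orphaned
```

Running a check like this before creating CourseRunGrade records would surface grades that need an enrollment created (or that should be skipped) rather than failing partway through a migration.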

rachellougee commented 2 years ago

@pdpinch @Ferdi I've run the numbers for DEDP program grades on production (related to the second use case described in the issue above).

As discussed with @rhysyngsun last week, we are currently only bringing in DEDP grades for the first use case in the ticket.

Learners that have taken at least one course run and one exam

Ferdi commented 2 years ago

There are 40955 learners who have taken at least one course run but never taken the exam

Can we get a distribution by year?

There are 2264 learners who have paid for at least one course run (enrollment mode is either verified or honor), but never taken the exam

I thought the DEDP courses from edx.org were all "audit" mode. I'm curious to know where "verified" / "honor" comes from. Are these from the old days, before we accepted payment on mm.mit.edu?

rachellougee commented 2 years ago

| graded_year | Learner count |
| --- | --- |
| 2017 | 1,291 |
| 2018 | 766 |
| 2019 | 970 |
| 2020 | 1,701 |
| 2021 | 1,030 |
| 2022 | 746 |

Query used, for reference, in case I missed something:

SELECT EXTRACT(YEAR FROM DEDP_grades.created_on) AS graded_year, COUNT(DISTINCT DEDP_grades.user_id)
FROM
(
    SELECT grade.user_id, grade.course_run_id, courserun.course_id, grade.created_on
    FROM grades_finalgrade AS grade
    JOIN dashboard_cachedenrollment AS enrollment ON grade.course_run_id = enrollment.course_run_id AND grade.user_id = enrollment.user_id
    JOIN public.courses_courserun AS courserun ON grade.course_run_id = courserun.id
    JOIN public.courses_course AS course ON course.id = courserun.course_id
    WHERE course.program_id = 2 AND grade.passed = true
) AS DEDP_grades
LEFT JOIN
(
    SELECT examgrade.user_id, examgrade.course_id
    FROM grades_proctoredexamgrade AS examgrade
    JOIN exams_examrun AS exam ON examgrade.exam_run_id = exam.id
    JOIN courses_course AS course ON exam.course_id = course.id
    WHERE course.program_id = 2
) AS DEDP_exam
  ON DEDP_grades.user_id = DEDP_exam.user_id AND DEDP_grades.course_id = DEDP_exam.course_id
WHERE DEDP_exam.user_id IS NULL
GROUP BY graded_year;

| title | edx_course_key | start_date | end_date | courseware_backend |
| --- | --- | --- | --- | --- |
| Data Analysis for Social Scientists | course-v1:MITx+14.310x+3T2016 | 2016-09-16 15:00 | 2016-12-18 00:00 | edxorg |
| Foundations of Development Policy: Advanced Development Economics | course-v1:MITx+14.740x+3T2016 | 2016-09-16 04:00 | 2016-12-11 23:59 | edxorg |
| Foundations of Development Policy: Advanced Development Economics | course-v1:MITx+14.74x+3T2015 | 2015-09-21 15:00 | 2015-12-10 15:00 | edxorg |
| Microeconomics | course-v1:MITx+14.100x+3T2016 | 2016-09-16 04:00 | 2016-12-15 15:00 | edxorg |
| The Challenges of Global Poverty | course-v1:MITx+14.73x_1+1T2016 | 2016-02-22 15:00 | 2016-05-15 15:00 | edxorg |
| The Challenges of Global Poverty | course-v1:MITx+14.73x+3T2016 | 2016-09-16 04:00 | 2016-12-11 15:00 | edxorg |
| The Challenges of Global Poverty | MITx/14.73x_1/1T2015 | 2015-02-03 15:00 | 2015-05-03 04:00 | edxorg |
| The Challenges of Global Poverty | MITx/14_73x/1T2014 | 2014-02-04 15:00 | 2014-05-11 15:00 | edxorg |


pdpinch commented 2 years ago

@rachellougee can you exclude people who have not passed the course from these counts? I think there's a boolean on the final grade model that you could use.

I think that will give us a better picture of how many people in group 2 are likely to want to pay for an exam.

rachellougee commented 2 years ago

@pdpinch I updated my last comment. The numbers now cover only learners who passed the course but have not taken the exam.

rachellougee commented 2 years ago

There are Slack conversations related to enrollments and grades. I'll post them here for reference.

I believe what we wanted to do was only bring in enrollments and grades for learners who had taken an exam, does that sound right? Meaning:

- If a learner has taken at least one exam, we'll bring over course run grades weighted w/ the best exam grade.
- If they haven't taken an exam:
  - If they've paid for the course, the course team will manually import their best legacy grade from MicroMasters into a future course run and the learner will use a coupon to "purchase" that course again.
  - If they've never paid, nothing to do here; they'll need to retake the course and pay on MITx Online to earn credit going forward.
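The decision rules quoted above can be sketched as a small classifier. This is only an illustration: the `Action` enum and `migration_action` function are hypothetical names, and treating "verified" and "honor" as the paid enrollment modes follows the counts quoted earlier in this thread rather than any confirmed rule.

```python
from enum import Enum
from typing import Iterable

# Assumption: "paid" means an enrollment mode of verified or honor,
# as in the counts quoted earlier in this thread.
PAID_MODES = frozenset({"verified", "honor"})


class Action(Enum):
    MIGRATE_WITH_EXAM = "bring over run grades weighted with best exam grade"
    MANUAL_IMPORT = "course team imports best legacy grade; learner uses coupon"
    NOTHING = "learner retakes and pays on MITx Online"


def migration_action(has_exam_grade: bool, enrollment_modes: Iterable[str]) -> Action:
    """Pick the handling for one learner in one course."""
    if has_exam_grade:
        return Action.MIGRATE_WITH_EXAM
    if any(mode in PAID_MODES for mode in enrollment_modes):
        return Action.MANUAL_IMPORT
    return Action.NOTHING
```

Only learners in the first branch would be handled by the automated migration; the other two are listed so the script can report them rather than silently skip them.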

@pdpinch Per my discussion with Nathan yesterday, we want to make sure the migrated enrollments have a matching grade. If we are only bringing over grades for group 1, we would want to do the same for enrollments. What are your thoughts?

Learners that have taken at least one course run and one exam