mitodl / mitxonline

Missing MITxT course enrollments in postgres #1597

Closed rachellougee closed 1 year ago

rachellougee commented 1 year ago

There are over 2k MITxT course enrollments that exist only in MITx Online Open edX and not in the Postgres database. The attached file is for reference; the list is limited to courses where enrollment has closed: https://docs.google.com/spreadsheets/d/1T6BDgurf7ngMV8Ool0-AGrc36AZjhyoi/edit#gid=1218501753

This was discovered while comparing our MITxT course enrollments with the Tableau report at https://github.com/mitodl/hq/issues/1316. For MITxT course enrollments, our data comes directly from the MITx Online app, whereas in the Tableau report IRx uses our MITx Online Open edX data, which causes the discrepancies.

This is for investigation. I don't know how much work it would take to address the issue if this turns out to be a bug in our sync process. @pdpinch FYI

pdpinch commented 1 year ago

@rachellougee can you double check "not in the Postgres database"?

I started poking around a little bit, and there are some enrollments I can't find, like the one in row 2.

https://docs.google.com/spreadsheets/d/1plJzS52q_RqncFwWjwxJNAe-5x9VJO5euR5RMtr7Bw4/edit#gid=0&range=A2:C2

I suspect this user is an instructor, and they were enrolled manually. This is an enrollment "hole" that we should close, see https://github.com/mitodl/hq/issues/1095

But, these all seem to be there:

https://docs.google.com/spreadsheets/d/1plJzS52q_RqncFwWjwxJNAe-5x9VJO5euR5RMtr7Bw4/edit#gid=0&range=A103:C107

in the django admin:

https://mitxonline.mit.edu/admin/courses/courserunenrollment/?q=yoshinee

rachellougee commented 1 year ago

@pdpinch The examples you highlighted (https://docs.google.com/spreadsheets/d/1plJzS52q_RqncFwWjwxJNAe-5x9VJO5euR5RMtr7Bw4/edit#gid=0&range=A103:C107) are due to a sync timing issue between the Open edX and MITx Online databases in our data lake. Our Open edX DB sync finished 3 hours ago, and the MITx Online app sync finished 14 hours ago. This user's enrollments all happened today, so my list might not be accurate for courses whose enrollment is still open.

pdpinch commented 1 year ago

Can you do a version of your "missing enrollments" sheet that is limited to courses where enrollments have closed?

rachellougee commented 1 year ago

https://docs.google.com/spreadsheets/d/1T6BDgurf7ngMV8Ool0-AGrc36AZjhyoi/edit I added a second tab that only includes closed enrollment.

I also added an instructor indicator to the list; there are 8 instructor enrollments. I'm not sure what else is going on.

pdpinch commented 1 year ago

Note, when the user loads their dashboard, any missing enrollments will get fixed. When I started investigating this, eventually every user I looked at was enrolled -- because I hijacked their accounts.

I still don't understand how these enrollments exist in open edX but not at all in mitxonline.

pdpinch commented 1 year ago

@annagav can you take a look at this?

rachellougee commented 1 year ago

I've updated https://docs.google.com/spreadsheets/d/1T6BDgurf7ngMV8Ool0-AGrc36AZjhyoi/edit#gid=435252790; some new records were added. I don't know why these missing enrollments didn't get fixed, since some users logged in to the MITx Online app after enrolling. Maybe they didn't go to their dashboard?

It seems the sync_enrollments command could be used to add these missing enrollments, but over 1k users need to be repaired. I'm not sure whether it's efficient to use this command as is, or whether we should tweak it so that we can do a batch sync.

As for monitoring this, we could implement something in the data platform since it's easy to access the open edx and MITx Online database and query the discrepancies in our data lake, but there might be a better way to implement it in the app.
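
A minimal sketch of the kind of discrepancy check described above, assuming both enrollment tables have already been exported from the data lake as (username, course ID) pairs. The function names, the row shape, and the sample rows are hypothetical and made up for illustration; this is not the actual data platform query.

```python
from collections import defaultdict


def find_missing_enrollments(edx_rows, app_rows):
    """Return (username, course_id) pairs present in Open edX but not in the app.

    Both arguments are iterables of (username, course_id) tuples. The names
    and shapes here are assumptions for illustration, not the real schema.
    """
    return sorted(set(edx_rows) - set(app_rows))


def group_by_course(missing):
    """Group missing enrollments by course ID so they can be repaired per run."""
    grouped = defaultdict(list)
    for username, course_id in missing:
        grouped[course_id].append(username)
    return dict(grouped)


# Made-up sample data, not real learners.
edx_rows = [
    ("alice", "course-v1:MITxT+JPAL101x+3T2022"),
    ("bob", "course-v1:MITxT+JPAL101x+3T2022"),
    ("alice", "course-v1:MITxT+14.310x+3T2022"),
]
app_rows = [("alice", "course-v1:MITxT+JPAL101x+3T2022")]

missing = find_missing_enrollments(edx_rows, app_rows)
by_course = group_by_course(missing)
```

Grouping the result by course ID gives a much shorter worklist than grouping by user when many learners are missing from the same few runs.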

rachellougee commented 1 year ago

I am thinking of adding an option to pass a list of course IDs as a parameter to sync_enrollments so that we can create these missing enrollments in bulk. There are 47 distinct course IDs, far fewer than the 1k+ distinct users.
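
A rough sketch of what a per-course-run batch repair could look like, written as plain Python rather than as the actual Django management command. The `--runs` option name matches the command quoted later in this thread, but its parsing and behavior here are assumptions; `create_missing` and the in-memory dictionaries are hypothetical stand-ins for the real Open edX API calls and ORM queries.

```python
import argparse


def parse_args(argv):
    """Parse a hypothetical --runs option taking one or more course run IDs."""
    parser = argparse.ArgumentParser(description="Create missing local enrollments")
    parser.add_argument("--runs", nargs="+", required=True, help="Course run IDs to repair")
    return parser.parse_args(argv)


def create_missing(runs, edx_enrollments, local_enrollments):
    """Create local enrollments that exist in Open edX but not locally.

    edx_enrollments maps course run ID -> set of usernames enrolled in Open edX.
    local_enrollments maps course run ID -> set of usernames enrolled locally;
    it is updated in place. Both stand in for real API calls and ORM queries.
    Returns a dict of course run ID -> number of enrollments created.
    """
    created = {}
    for run in runs:
        edx_users = edx_enrollments.get(run, set())
        local_users = local_enrollments.setdefault(run, set())
        to_create = edx_users - local_users
        local_users |= to_create  # in-place union, so the dict entry is updated
        created[run] = len(to_create)
    return created


# Made-up sample data, not real learners.
edx = {"course-v1:MITxT+JPAL101x+3T2022": {"alice", "bob", "carol"}}
local = {"course-v1:MITxT+JPAL101x+3T2022": {"alice"}}

args = parse_args(["--runs", "course-v1:MITxT+JPAL101x+3T2022"])
result = create_missing(args.runs, edx, local)
for run, count in result.items():
    print(f"{count} Enrollments created for {run}")
```

Because only the set difference is created, running the same repair twice is harmless: the second pass finds nothing left to create.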

@jkachel @pdpinch Do you have any concerns?

rachellougee commented 1 year ago

Still need to run the command once the code is deployed to production.

rachellougee commented 1 year ago

I am waiting for a refreshed list of enrollment discrepancies between Open edX and the MITx Online app once our data lake is up to date, so that I know which course IDs to fix. Besides that, it looks like learners can enroll directly from https://courses.mitxonline.mit.edu/learn/course/course-v1:MITxT+JPAL101x+3T2022/home without going through the MITx Online app. I was able to reproduce this by enrolling via that link: my enrollment stayed in Open edX and didn't sync to the app until I manually went to my dashboard.

rachellougee commented 1 year ago

I've run an updated list of discrepancies. Compared with the list I generated 3 weeks ago, there is only one additional enrollment, and some of the missing enrollments fixed themselves when learners loaded their dashboard. But there are still around 2948 enrollments (48 distinct course IDs) that need to be fixed/created in MITx Online. I will run ./manage.py create_local_enrollments --runs today.

rachellougee commented 1 year ago

Here is the output of running manage.py create_local_enrollments --runs to create these missing enrollments on MITx Online:

    81 Enrollments created for course-v1:MITxT+JPAL102x+3T2021
    84 Enrollments created for course-v1:MITxT+14.310x+3T2021 
    1 Enrollments created for course-v1:MITxT+14.310PEx+2T2022 
    3 Enrollments created for course-v1:MITxT+8.01.3x+3T2022 
    5 Enrollments created for course-v1:MITxT+14.310PEx+1T2022a 
    134 Enrollments created for course-v1:MITxT+14.740x+2T2022 
    2 Enrollments created for course-v1:MITxT+24.118x+2T2022 
    5 Enrollments created for course-v1:MITxT+14.73PEx+1T2022 
    1 Enrollments created for course-v1:MITxT+24.10x+3T2022 
    7 Enrollments created for course-v1:MITxT+8.S50.1x+3T2022 
    2 Enrollments created for course-v1:MITxT+14.750PEx+1T2022 
    3 Enrollments created for course-v1:MITxT+8.01.4x+3T2022 
    2 Enrollments created for course-v1:MITxT+3.034.1x+1T2023 
    14 Enrollments created for course-v1:MITxT+JPAL101x+3T2022 
    20 Enrollments created for course-v1:MITxT+21A.819.2x+3T2021 
    92 Enrollments created for course-v1:MITxT+JPAL102x+1T2022 
    43 Enrollments created for course-v1:MITxT+14.73x+3T2021 
    2 Enrollments created for course-v1:MITxT+JPAL102PEx+1T2022a 
    110 Enrollments created for course-v1:MITxT+14.750x+1T2023 
    132 Enrollments created for course-v1:MITxT+14.73x+1T2023 
    148 Enrollments created for course-v1:MITxT+14.740x+1T2023 
    6 Enrollments created for course-v1:MITxT+14.310PEx+1T2022 
    4 Enrollments created for course-v1:MITxT+14.100PEx+1T2022 
    1 Enrollments created for course-v1:MITxT+8.04x+3T2022 
    168 Enrollments created for course-v1:MITxT+14.100x+3T2022 
    85 Enrollments created for course-v1:MITxT+14.009x+2T2022 
    136 Enrollments created for course-v1:MITxT+14.100x+3T2021 
    59 Enrollments created for course-v1:MITxT+14.73x+1T2022 
    102 Enrollments created for course-v1:MITxT+14.310x+1T2022 
    24 Enrollments created for course-v1:MITxT+21A.819.1x+3T2021 
    4 Enrollments created for course-v1:MITxT+8.01.1x+3T2022 
    1 Enrollments created for course-v1:MITxT+8.01.2x+3T2022 
    3 Enrollments created for course-v1:MITxT+15.699x+3T2022 
    2 Enrollments created for course-v1:MITxT+24.09x+2T2022 
    134 Enrollments created for course-v1:MITxT+14.750x+3T2022 
    291 Enrollments created for course-v1:MITxT+14.100x+2T2022 
    151 Enrollments created for course-v1:MITxT+14.310x+2T2022 
    4 Enrollments created for course-v1:MITxT+JPAL101SPAx+1T2023 
    6 Enrollments created for course-v1:MITxT+14.100PEx+2T2022 
    160 Enrollments created for course-v1:MITxT+JPAL102x+3T2022 
    265 Enrollments created for course-v1:MITxT+JPAL102x+1T2023 
    70 Enrollments created for course-v1:MITxT+14.750x+3T2021 
    69 Enrollments created for course-v1:MITxT+14.310x+3T2022 
    288 Enrollments created for course-v1:MITxT+14.310x+1T2023 
    1 Enrollments created for course-v1:MITxT+JPAL102PEx+1T2022 
    2 Enrollments created for course-v1:MITxT+14.73PEx+1T2022a 
    87 Enrollments created for course-v1:MITxT+14.73x+3T2022 
    7 Enrollments created for course-v1:MITxT+14.740PEx+2T2022 

@pdpinch FYI. This should be fixed in production now.

pdpinch commented 1 year ago

How can we monitor this going forward?

I know we have some enrollment "side doors" that need to be closed to stop this from happening in the future. I opened an issue for looking into that, https://github.com/mitodl/hq/issues/1637

rachellougee commented 1 year ago

For monitoring the enrollment discrepancies between Open edX and the app, we can expose them in BI and set up an alert. We will need to add Open edX enrollments to intermediate in the data platform first. @pdpinch Please let me know your thoughts on this.

If we want to monitor it from the MITx Online side, maybe we could implement something after user login? That might be too noisy, and possibly won't work as expected until we close those "side doors". @jkachel do you have any suggestions on how to monitor it on the MITx Online side?

jkachel commented 1 year ago

It might be best to do this in two places -

  1. As noted, loading the dashboard triggers an enrollment data sync from edX. (Specifically, it's the user enrollments list API that does it.) The underlying openedx api call could be amended to emit a log message if it's found enrollments on the edX side. This would only cover some of the cases, though; users don't always hit the dashboard. We'd also want to expand the api function itself so that we can turn off the reporting if we're running it manually (there's a management command to do it; if I'm actively doing it I don't need to bother Sentry about it).
  2. We can add in a sweep task to check for enrollments overnight. edX does have a "get everything" enrollment API that we could use for this purpose. We can then emit a log message or send a report or something when it finds things that aren't there already (and then also fix the enrollments, of course).
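
The two ideas above could be sketched roughly as follows. This is not the MITx Online code: `sync_enrollments`, `nightly_sweep`, the `report` flag, and the in-memory sets are all hypothetical stand-ins for the real API function, a scheduled task, and the Open edX and database calls. The warning is emitted through the standard `logging` module, on the assumption that whatever error reporting is configured would pick it up.

```python
import logging

log = logging.getLogger(__name__)


def sync_enrollments(user, edx_course_ids, local_course_ids, report=True):
    """Add enrollments found in Open edX but missing locally for one user.

    edx_course_ids: set of course run IDs the user is enrolled in on Open edX.
    local_course_ids: set of course run IDs known locally; updated in place.
    report: set to False for manual repair runs so no alert is emitted.
    Returns the sorted list of course run IDs that were added.
    """
    missing = sorted(edx_course_ids - local_course_ids)
    if missing and report:
        log.warning("Found %d unsynced enrollment(s) for %s: %s", len(missing), user, missing)
    local_course_ids.update(missing)
    return missing


def nightly_sweep(edx_by_user, local_by_user):
    """Check every user, repair anything missing, and return a summary report."""
    summary = {}
    for user, edx_ids in edx_by_user.items():
        fixed = sync_enrollments(user, edx_ids, local_by_user.setdefault(user, set()))
        if fixed:
            summary[user] = fixed
    return summary


# Made-up sample data, not real learners or course runs.
edx_by_user = {"alice": {"run-a", "run-b"}, "bob": {"run-a"}}
local_by_user = {"alice": {"run-a"}, "bob": {"run-a"}}

summary = nightly_sweep(edx_by_user, local_by_user)
```

Passing `report=False` covers the manual case: the repair still happens, but no log message is emitted.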

rachellougee commented 1 year ago

Closing this, as ongoing monitoring will be addressed in https://github.com/mitodl/hq/issues/1637