mitodl / ol-data-platform

Pipeline definitions for managing data flows to power analytics at MIT Open Learning
BSD 3-Clause "New" or "Revised" License

fix dbt error on mart combined users #1197

Closed rachellougee closed 1 month ago

rachellougee commented 2 months ago

What are the relevant tickets?

NA

Description (What does it do?)

This fixes the dbt warnings and errors in today's run https://pipelines.odl.mit.edu/runs/3a4d8047-b1c8-4523-90b0-16e7b6b896f0

Failure in test not_null_marts__combined__users_platforms (models/marts/combined/_marts__combined__models.yml)
4:45:41.689 AM
run_dbt_2db87
ERROR
  Got 360 results, configured to fail if >10

Some MicroMasters users were added to int__mitx__users, but they don't have any social auth account for MITx Online or edxorg, so the platform is null for these users. I think we should filter them out of the mart until we know what to do with them. Let me know if you disagree.
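A minimal sketch of the proposed interim filter, using an in-memory SQLite table as a stand-in for the rows feeding the mart. The table name `users_stub` and its columns are hypothetical and only illustrate the idea; the real logic lives in the mart's dbt model.

```python
import sqlite3

# Hypothetical stand-in for the rows feeding the combined users mart.
conn = sqlite3.connect(":memory:")
conn.execute("create table users_stub (user_id integer, platform text)")
conn.executemany(
    "insert into users_stub values (?, ?)",
    [(1, "MITx Online"), (2, "edX.org"), (3, None), (4, None)],
)

# Rows that a not_null test on `platform` would count as failures.
null_platform_count = conn.execute(
    "select count(*) from users_stub where platform is null"
).fetchone()[0]

# Interim fix discussed above: keep only users whose platform is resolved.
filtered_rows = conn.execute(
    "select user_id, platform from users_stub "
    "where platform is not null order by user_id"
).fetchall()
```

With the filter in place, the not_null test has nothing left to flag, at the cost of dropping those users from the mart until their platform can be determined.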

How can this be tested?

dbt build --select marts__combined__users

Additional Context

pdpinch commented 1 month ago

I don't understand -- did you say we have an email for them, but no socialauth for edx.org or for mitxonline?

KatelynGit commented 1 month ago

I don't understand -- did you say we have an email for them, but no socialauth for edx.org or for mitxonline?

Yes, I'm seeing MicroMasters emails for the folks excluded by this change. To pull examples: select * from int__mitx__users where user_mitxonline_email is null and user_edxorg_email is null
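For anyone who wants to try that query shape without warehouse access, here is a small sketch against an in-memory SQLite table. The two email columns come from the query above; the `user_id` column and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table int__mitx__users (
        user_id integer,             -- hypothetical key column
        user_mitxonline_email text,
        user_edxorg_email text
    )
    """
)
conn.executemany(
    "insert into int__mitx__users values (?, ?, ?)",
    [
        (1, "a@example.com", None),   # linked to MITx Online only
        (2, None, "b@example.com"),   # linked to edxorg only
        (3, None, None),              # linked to neither platform
    ],
)

# Same predicate as the query in the comment above.
unlinked_users = conn.execute(
    "select * from int__mitx__users "
    "where user_mitxonline_email is null and user_edxorg_email is null"
).fetchall()
```

Only the row with no email from either platform comes back, which is the population this change excludes.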

rachellougee commented 1 month ago

So it looks like a lot of these folks do have a MicroMasters email address even though they don't have a mitxonline or edxorg one.

Yes, these folks do have a MicroMasters email. But MicroMasters is not a platform, and the combined users mart currently pulls email from MITx Online or edxorg; see https://github.com/mitodl/ol-data-platform/blob/main/src/ol_dbt/models/marts/combined/marts__combined__users.sql#L43-L70.

I don't understand -- did you say we have an email for them, but no socialauth for edx.org or for mitxonline?

As mentioned above, these users do have a MicroMasters email, but they either don't have any socialauth at all or we can't link them to MITx Online or edx.org based on their socialauth, so their email and platform are blank. We got these warnings and errors because I recently added these MicroMasters users to int__mitx__users (https://github.com/mitodl/ol-data-platform/blob/main/src/ol_dbt/models/intermediate/mitx/int__mitx__users.sql#L164-L198), and they then flowed into this mart.

If we do want to include these users in the mart, then we need to fix their email and profile fields to pull from MicroMasters. But I am unsure which platform these users belong to when they have no socialauth at all.
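One possible shape for that email fallback, sketched with SQLite. This is an assumption, not the mart's actual logic: the `user_micromasters_email` column, the `users_stub` table, and the priority order inside `coalesce` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table users_stub (
        user_id integer,
        user_mitxonline_email text,
        user_edxorg_email text,
        user_micromasters_email text  -- hypothetical column name
    )
    """
)
conn.executemany(
    "insert into users_stub values (?, ?, ?, ?)",
    [
        (1, "a@mitxonline.example", None, "a@mm.example"),
        (2, None, "b@edx.example", None),
        (3, None, None, "c@mm.example"),  # MicroMasters-only user
    ],
)

# Fall back to the MicroMasters email only when neither platform has one.
# The priority order here is assumed for illustration.
resolved_emails = conn.execute(
    """
    select
        user_id,
        coalesce(
            user_mitxonline_email,
            user_edxorg_email,
            user_micromasters_email
        ) as user_email
    from users_stub
    order by user_id
    """
).fetchall()
```

This would only fill in the email. The platform for users with no socialauth at all would still be undetermined, which is the open question above.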

rachellougee commented 1 month ago

I updated the PR to filter out only the roughly 360 MicroMasters users who don't have any social auth accounts. That's far fewer than before.