mitodl / mit-open

Several ETL bugs #162

Closed by mbertrand 10 months ago

mbertrand commented 10 months ago

Expected Behavior

Prolearn: does not import xPRO courses or programs (these are handled by a separate pipeline).

Micromasters: gracefully handles programs with blank URLs (present in the Micromasters RC API response but not in production).

Current Behavior

Prolearn: xPRO courses and programs are imported.

Micromasters: the ETL pipeline errors out when reading API data from RC.

Steps to Reproduce

Prolearn: run ./manage.py backpopulate_prolearn_data

Micromasters: run ./manage.py backpopulate_micromasters_data

Possible Solution

Prolearn: adjust the Prolearn API query to omit xPRO.

Micromasters: adjust the pipeline to expect potentially null URLs.
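
For illustration only, here is a minimal Python sketch of the Prolearn side. It filters records after they are fetched rather than changing the API query itself, which is a different technique from the one proposed above. The record shape (a dict with a "platform" key), the value "xpro", and the function name are all assumptions, not taken from the actual pipeline.

```python
# Hypothetical: platforms that are imported by a separate pipeline.
EXCLUDED_PLATFORMS = {"xpro"}


def omit_excluded_platforms(items, excluded=EXCLUDED_PLATFORMS):
    """Return only the items whose platform is not in the excluded set.

    Assumes each item is a dict with an optional "platform" key
    (an illustrative shape, not the real Prolearn response format).
    """
    kept = []
    for item in items:
        # Normalize so that "xPRO", " xpro " and "xpro" all match.
        platform = (item.get("platform") or "").strip().lower()
        if platform not in excluded:
            kept.append(item)
    return kept
```

Filtering in the query would avoid fetching the unwanted records at all; a post-fetch filter like this one is only a fallback if the query cannot express the exclusion.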

ChristopherChudzicki commented 10 months ago

Micromasters: gracefully handles programs with blank URLs (present in the Micromasters RC API response but not in production)

Is that RC data bad? Can we just delete it?

mbertrand commented 10 months ago

I believe it is bad data and should be fixed on the micromasters side, but ideally the pipeline should be able to continue on to the next item when bad data is encountered.
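
A rough sketch of that "continue on to the next item" behavior, assuming a per-program transform step. The function names, the "title" and "url" fields, and the choice to skip (rather than import with an empty URL) are illustrative assumptions, not the pipeline's actual code.

```python
import logging

log = logging.getLogger(__name__)


def transform_program(program):
    """Transform one raw program dict (hypothetical shape) or raise ValueError."""
    url = program.get("url")
    if not url:
        # Covers both None and empty-string URLs.
        raise ValueError("program has no url")
    return {"title": program.get("title"), "url": url}


def transform_programs(programs):
    """Transform every program, logging and skipping items with bad data."""
    results = []
    for program in programs:
        try:
            results.append(transform_program(program))
        except (ValueError, KeyError, TypeError, AttributeError):
            # One bad record should not abort the whole run.
            log.exception("Skipping program with bad data: %r", program)
    return results
```

With this shape, a program with a blank URL is logged and skipped while the rest of the batch is still processed.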