mitodl / ol-data-platform

Pipeline definitions for managing data flows to power analytics at MIT Open Learning
BSD 3-Clause "New" or "Revised" License

Extract openedx course metadata #1172

Status: Closed (quazi-h closed this 2 months ago)

quazi-h commented 2 months ago

What are the relevant tickets?

https://github.com/mitodl/hq/issues/4067

Description (What does it do?)

Extends the openedx lib in our Dagster pipelines to process the course bundle metadata. The new code finds the course metadata XML file using the run tag parsed from courses.xml. From there, a handful of metadata attributes are parsed from the file and written to a JSON file, similar to how the video XML data is handled in the same pipeline.
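
For reviewers who want a picture of the flow described above, here is a minimal sketch using only the Python standard library. It is not the code in this PR. The function name `extract_course_metadata`, the `url_name` attribute as the carrier of the run tag, the `<run_tag>.xml` file naming, and the attribute names in `WANTED_ATTRIBUTES` are all assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical attribute list; the real set of metadata attributes may differ.
WANTED_ATTRIBUTES = ("display_name", "start", "end")


def extract_course_metadata(
    root_xml: Path, metadata_dir: Path, output_json: Path
) -> dict:
    """Resolve the run tag, parse the metadata XML, and write a JSON file."""
    # Assumption: the root element of the top-level XML file carries the
    # run tag in a "url_name" attribute.
    run_tag = ET.parse(root_xml).getroot().attrib["url_name"]

    # Assumption: the metadata file is named after the run tag.
    metadata_file = metadata_dir / f"{run_tag}.xml"
    attrs = ET.parse(metadata_file).getroot().attrib

    # Missing attributes are recorded as None rather than raising.
    metadata = {"run": run_tag}
    for name in WANTED_ATTRIBUTES:
        metadata[name] = attrs.get(name)

    output_json.write_text(json.dumps(metadata, indent=2))
    return metadata
```

Passing the paths in as arguments keeps the sketch independent of the actual bundle layout, which is not spelled out in this description.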

How can this be tested?

Once deployed to QA, test the new pipeline by kicking it off in Dagster.

blarghmatey commented 2 months ago

Once the above comments are addressed, this should be good to merge.