mitodl / ocw-data-parser

A parsing script for MIT OpenCourseWare course data

course features missing from parsed json #46

Open pdpinch opened 4 years ago

pdpinch commented 4 years ago

In the raw JSON at https://s3.amazonaws.com/ocw-content-storage/PROD/1/1.011/Spring_2011/1-011-project-evaluation-spring-2011/0/1.json, there are five course features, including two "Assignments":

    "feature_requirements": [
        {
            "ocw_feature": "Assignments", 
            "ocw_subfeature": "problem sets (no solutions)", 
            "ocw_feature_url": "/courses/civil-and-environmental-engineering/1-011-project-evaluation-spring-2011/assignments/", 
            "ocw_feature_notes": "", 
            "ocw_speciality": ""
        }, 
        {
            "ocw_feature": "Exams", 
            "ocw_subfeature": "No solutions", 
            "ocw_feature_url": "/courses/civil-and-environmental-engineering/1-011-project-evaluation-spring-2011/exams/", 
            "ocw_feature_notes": "", 
            "ocw_speciality": ""
        }, 
        {
            "ocw_feature": "Lecture notes", 
            "ocw_subfeature": "Selected", 
            "ocw_feature_url": "/courses/civil-and-environmental-engineering/1-011-project-evaluation-spring-2011/lecture-notes/", 
            "ocw_feature_notes": "", 
            "ocw_speciality": ""
        }, 
        {
            "ocw_feature": "Projects", 
            "ocw_subfeature": "Examples", 
            "ocw_feature_url": "/courses/civil-and-environmental-engineering/1-011-project-evaluation-spring-2011/projects/", 
            "ocw_feature_notes": "", 
            "ocw_speciality": ""
        }, 
        {
            "ocw_feature": "Assignments", 
            "ocw_subfeature": "written (no examples)", 
            "ocw_feature_url": "/courses/civil-and-environmental-engineering/1-011-project-evaluation-spring-2011/assignments/", 
            "ocw_feature_notes": "", 
            "ocw_speciality": ""
        }
    ], 

However, in the output master.json, there are only four course features:

  "course_features": [
    {
      "ocw_feature": "Assignments", 
      "ocw_subfeature": "written (no examples)", 
      "ocw_feature_url": "./resolveuid/33dce1c1af39ae75989e6f83a9b72843", 
      "ocw_speciality": "", 
      "ocw_feature_notes": ""
    }, 
    {
      "ocw_feature": "Exams", 
      "ocw_subfeature": "No solutions", 
      "ocw_feature_url": "./resolveuid/5d340f30969a28be1f3ef60996c1f18b", 
      "ocw_speciality": "", 
      "ocw_feature_notes": ""
    }, 
    {
      "ocw_feature": "Lecture notes", 
      "ocw_subfeature": "Selected", 
      "ocw_feature_url": "./resolveuid/3a42244554334f71bf0320890ea05b4a", 
      "ocw_speciality": "", 
      "ocw_feature_notes": ""
    }, 
    {
      "ocw_feature": "Projects", 
      "ocw_subfeature": "Examples", 
      "ocw_feature_url": "./resolveuid/314b9f96fe11689cdf83246d3d760637", 
      "ocw_speciality": "", 
      "ocw_feature_notes": ""
    }
  ], 
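
One way to pinpoint which entry was dropped is to compare the two lists on the (ocw_feature, ocw_subfeature) pair, since the URLs are rewritten to resolveuid links during parsing. The following is only a minimal sketch: the helper name `dropped_features` is hypothetical and not part of ocw-data-parser, and the data is abbreviated from the two excerpts above.

```python
from collections import Counter


def dropped_features(raw_features, parsed_features):
    """Return (feature, subfeature) pairs present in raw but missing from parsed.

    Hypothetical helper for illustration; not part of ocw-data-parser.
    """
    def key(feature):
        return (feature["ocw_feature"], feature["ocw_subfeature"])

    raw_counts = Counter(key(f) for f in raw_features)
    parsed_counts = Counter(key(f) for f in parsed_features)
    # Counter subtraction keeps only the pairs that occur more often in raw.
    return sorted((raw_counts - parsed_counts).elements())


# Abbreviated from "feature_requirements" in the raw JSON above.
raw = [
    {"ocw_feature": "Assignments", "ocw_subfeature": "problem sets (no solutions)"},
    {"ocw_feature": "Exams", "ocw_subfeature": "No solutions"},
    {"ocw_feature": "Lecture notes", "ocw_subfeature": "Selected"},
    {"ocw_feature": "Projects", "ocw_subfeature": "Examples"},
    {"ocw_feature": "Assignments", "ocw_subfeature": "written (no examples)"},
]

# Abbreviated from "course_features" in master.json above.
parsed = [
    {"ocw_feature": "Assignments", "ocw_subfeature": "written (no examples)"},
    {"ocw_feature": "Exams", "ocw_subfeature": "No solutions"},
    {"ocw_feature": "Lecture notes", "ocw_subfeature": "Selected"},
    {"ocw_feature": "Projects", "ocw_subfeature": "Examples"},
]

print(dropped_features(raw, parsed))
```

For this course, the only pair reported is the first "Assignments" entry, the one with subfeature "problem sets (no solutions)".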

pdpinch commented 2 years ago

Another example is 11-127j-spring-2015. Compare the raw JSON at s3://ocw-content-storage/PROD/11/11.127/Spring_2015/11-127j-computer-games-and-simulations-for-education-and-exploration-spring-2015/0/1.json

    "feature_requirements": [
        {
            "ocw_feature": "Projects", 
            "ocw_subfeature": "Examples", 
            "ocw_feature_url": "/courses/urban-studies-and-planning/11-127j-computer-games-and-simulations-for-education-and-exploration-spring-2015/student-games/", 
            "ocw_speciality": "", 
            "ocw_feature_notes": ""
        }, 
        {
            "ocw_feature": "Instructor Insights", 
            "ocw_subfeature": "", 
            "ocw_feature_url": "/courses/urban-studies-and-planning/11-127j-computer-games-and-simulations-for-education-and-exploration-spring-2015/instructor-insights/", 
            "ocw_speciality": "", 
            "ocw_feature_notes": ""
        }, 
        {
            "ocw_feature": "This Course at MIT", 
            "ocw_subfeature": "", 
            "ocw_feature_url": "/courses/urban-studies-and-planning/11-127j-computer-games-and-simulations-for-education-and-exploration-spring-2015/instructor-insights", 
            "ocw_speciality": "", 
            "ocw_feature_notes": ""
        }, 
        {
            "ocw_feature": "Assignments", 
            "ocw_subfeature": "(no examples)", 
            "ocw_feature_url": "/courses/urban-studies-and-planning/11-127j-computer-games-and-simulations-for-education-and-exploration-spring-2015/assignments/", 
            "ocw_speciality": "", 
            "ocw_feature_notes": ""
        }

to the parsed JSON s3://open-learning-course-data-production/11-127j-computer-games-and-simulations-for-education-and-exploration-spring-2015/11-127j-computer-games-and-simulations-for-education-and-exploration-spring-2015_parsed.json

  "course_feature_tags": [
    {
      "ocw_feature_url": "./resolveuid/7c3cdcb88ea17296cda3ec9241ee1af9", 
      "course_feature_tag": "Projects with Examples"
    }
  ], 

In the same parsed JSON, you can see that the course_features list is complete:

  "course_features": [
    {
      "ocw_feature": "Projects", 
      "ocw_subfeature": "Examples", 
      "ocw_feature_url": "./resolveuid/7c3cdcb88ea17296cda3ec9241ee1af9", 
      "ocw_feature_notes": "", 
      "ocw_speciality": ""
    }, 
    {
      "ocw_feature": "Instructor Insights", 
      "ocw_subfeature": "", 
      "ocw_feature_url": "./resolveuid/19b1d4162c63bbbee939b26b6a767bda", 
      "ocw_feature_notes": "", 
      "ocw_speciality": ""
    }, 
    {
      "ocw_feature": "This Course at MIT", 
      "ocw_subfeature": "", 
      "ocw_feature_url": "./resolveuid/19b1d4162c63bbbee939b26b6a767bda", 
      "ocw_feature_notes": "", 
      "ocw_speciality": ""
    }, 
    {
      "ocw_feature": "Assignments", 
      "ocw_subfeature": "(no examples)", 
      "ocw_feature_url": "./resolveuid/cf36f31f70ddf7ad52e8e74cc8afb65d", 
      "ocw_feature_notes": "", 
      "ocw_speciality": ""
    }
  ], 

mbertrand commented 2 years ago

@pdpinch @gumaerc I think this is deliberate for the most part: course features are intended to be unique (so "Assignments" only shows up once, for example), and course feature tags are determined by a mapping. In the example above, "This Course At MIT" and "Instructor Insights" map to TAG_NONE. The one potential bug is that a feature of "Assignments" with subfeature "activity (no examples)" should map to TAG_ACTIVITY_ASSIGNMENTS, but here the subfeature is just "(no examples)", so it doesn't map to any tag.
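
To make the two behaviors concrete, here is a rough sketch of how they could produce the outputs shown earlier in this thread. It is not the parser's actual code: `unique_features`, `course_feature_tags`, and `TAG_MAP` are hypothetical names, the value assigned to TAG_ACTIVITY_ASSIGNMENTS is a placeholder, and keying uniqueness on `ocw_feature` with the last entry winning is an assumption that happens to match the master.json output above.

```python
# Placeholder values; the real constants live in the parser and may differ.
TAG_NONE = None
TAG_ACTIVITY_ASSIGNMENTS = "TAG_ACTIVITY_ASSIGNMENTS"

# Hypothetical mapping from (feature, subfeature) to a tag.
TAG_MAP = {
    ("Projects", "Examples"): "Projects with Examples",
    ("Instructor Insights", ""): TAG_NONE,
    ("This Course at MIT", ""): TAG_NONE,
    ("Assignments", "activity (no examples)"): TAG_ACTIVITY_ASSIGNMENTS,
}


def unique_features(features):
    """Keep one entry per ocw_feature (assumed rule: the last one wins).

    A dict keeps the position of the first insertion of a key, so a repeated
    feature stays in its original slot but carries the later entry's values.
    """
    return list({f["ocw_feature"]: f for f in features}.values())


def course_feature_tags(features):
    """Return the tags for features that map to something other than TAG_NONE."""
    tags = []
    for f in features:
        tag = TAG_MAP.get((f["ocw_feature"], f["ocw_subfeature"]), TAG_NONE)
        if tag is not TAG_NONE:
            tags.append(tag)
    return tags


# Abbreviated from the 1.011 raw JSON in the first comment.
course_1_011 = [
    {"ocw_feature": "Assignments", "ocw_subfeature": "problem sets (no solutions)"},
    {"ocw_feature": "Exams", "ocw_subfeature": "No solutions"},
    {"ocw_feature": "Lecture notes", "ocw_subfeature": "Selected"},
    {"ocw_feature": "Projects", "ocw_subfeature": "Examples"},
    {"ocw_feature": "Assignments", "ocw_subfeature": "written (no examples)"},
]

# Abbreviated from the 11.127J raw JSON in the second comment.
course_11_127j = [
    {"ocw_feature": "Projects", "ocw_subfeature": "Examples"},
    {"ocw_feature": "Instructor Insights", "ocw_subfeature": ""},
    {"ocw_feature": "This Course at MIT", "ocw_subfeature": ""},
    {"ocw_feature": "Assignments", "ocw_subfeature": "(no examples)"},
]

print([f["ocw_subfeature"] for f in unique_features(course_1_011)])
print(course_feature_tags(course_11_127j))
```

Under these assumptions the first course keeps four features, with "Assignments" carrying the subfeature "written (no examples)", and the second course yields a single tag, "Projects with Examples", because "(no examples)" has no entry in the mapping.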