mitodl / open-discussions

BSD 3-Clause "New" or "Revised" License

Import courses from https://prolearn.mit.edu/ #3787

Closed: mbertrand closed this issue 1 year ago

mbertrand commented 1 year ago

Use the URL https://prolearn.mit.edu/graphql to request the course info to be imported.

Request body:

{"query":"\n      query {\n        searchAPISearch(index_id:\"default_solr_index\", range:{limit: 999, offset: 0}) {\n          result_count\n          documents {\n          ... on DefaultSolrIndexDoc {\n              title\nnid\nurl\ncertificate_name\ncourse_application_url\ncourse_link\nfield_course_or_program\nstart_value\nend_value\ndepartment\ndepartment_url\nbody\nbody_override\nfield_time_commitment\nfield_duration\nfeatured_image_url\nfield_featured_video\nfield_non_degree_credits\nfield_price\nfield_related_courses_programs\nrelated_courses_programs_title\nfield_time_commitment\nucc_hot_topic\nucc_name\nucc_tid\napplication_process\napplication_process_override\nformat_name\nimage_override_url\nvideo_override_url\nfield_new_course_program\nfield_tooltip\n            }\n          }\n        }\n      }\n    "}
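A minimal sketch of how this request could be sent from Python, using only the standard library. The `build_payload` and `fetch_documents` names and the shortened `FIELDS` list are hypothetical, and the response shape (`data.searchAPISearch.documents`) is an assumption based on the query structure, not something confirmed in this issue.

```python
import json
import urllib.request

PROLEARN_GRAPHQL_URL = "https://prolearn.mit.edu/graphql"

# A subset of the fields from the request body above; extend as needed.
FIELDS = ["title", "nid", "url", "course_link", "course_application_url", "department"]


def build_payload(fields, limit=999, offset=0):
    """Return the JSON-serializable request body for the search query."""
    field_block = "\n".join(fields)
    query = (
        "query {\n"
        '  searchAPISearch(index_id:"default_solr_index", '
        f"range:{{limit: {limit}, offset: {offset}}}) {{\n"
        "    result_count\n"
        "    documents {\n"
        "      ... on DefaultSolrIndexDoc {\n"
        f"{field_block}\n"
        "      }\n"
        "    }\n"
        "  }\n"
        "}"
    )
    return {"query": query}


def fetch_documents(fields=FIELDS, limit=999, offset=0, timeout=30):
    """POST the query and return the list of documents (performs a network call)."""
    request = urllib.request.Request(
        PROLEARN_GRAPHQL_URL,
        data=json.dumps(build_payload(fields, limit, offset)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumed response shape: {"data": {"searchAPISearch": {"documents": [...]}}}
    return body["data"]["searchAPISearch"]["documents"]
```

Keeping payload construction separate from the network call means the query text can be checked without contacting Prolearn.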

mbertrand commented 1 year ago

Prolearn returns courses & programs from the following sources:

 'MIT Bootcamps': 3,
 'MIT Professional Education': 181,
 'MIT xPRO': 52,
 'MIT Sloan Executive Education': 109,
 'MIT Schwarzman College of Computing': 1,
 'MIT Center for Transportation & Logistics': 1,
 'MIT CSAIL': 1

(Only xPRO returns programs, for now.)

I'm assuming that xPRO results should be ignored since we are importing them directly from its own API.
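A rough sketch of how the per-source counts and the xPRO exclusion might look. It assumes the source name shown above is carried in each document's `department` field, which this issue does not confirm; the function names are hypothetical.

```python
from collections import Counter

# Sources skipped because they are imported elsewhere (xPRO has its own API).
EXCLUDED_DEPARTMENTS = {"MIT xPRO"}


def count_by_department(documents):
    """Count documents per source; assumes the source is in `department`."""
    return Counter(doc.get("department") for doc in documents)


def exclude_departments(documents, excluded=EXCLUDED_DEPARTMENTS):
    """Return only the documents whose department is not in `excluded`."""
    return [doc for doc in documents if doc.get("department") not in excluded]
```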

For each result, there is a Prolearn URL and one or two source URLs. For example:

  'url': '/building-great-teams-enterprise-leaders-playbook',
  'course_application_url': 'https://executive.mit.edu/course/a056g00000URaacAAD.html',
  'course_link': 'https://executive.mit.edu/course/a056g00000URaacAAD.html',

Which URL should be stored as the URL we link to from open?

Topics returned by ProLearn include:

 'AI/Machine Learning',
 'BioTech',
 'Blockchain',
 'Business Analytics',
 'Business Innovation',
 'Data Science',
 'Digital Business/IT',
 'Educators',
 'Entrepreneurship',
 'Finance',
 'Leadership & Organizations',
 'Management',
 'Manufacturing',
 'Marketing & Communications',
 'Operations',
 'Other Business',
 'Other Engineering',
 'Real Estate',
 'Strategy',
 'Sustainability',
 'Systems Engineering',
 'Systems Thinking',
 'Tech Management & Business',
 'Technology Innovation'

Keep them as is or map them to another set of topics?

Ferdi commented 1 year ago

> I'm assuming that xPRO results should be ignored since we are importing them directly from its own API.

ok

> Which URL should be stored as the URL we link to from open?

Which URL do we use when we import from other sources (like edX)? Is there any way to keep both URLs, or is it hardcoded in the schema?

> Keep them as is or map them to another set of topics?

We need a mapping CSV file.

mbertrand commented 1 year ago

"MicroMasters" and "MITx" seem to link directly to the edX URLs, so to be consistent we could use course_link or course_application_url (they are often the same, but not always).

All the URLs will be stored along with the rest of the API response in the raw_json database field for a course, but only one can be assigned to the url field and linked to in the frontend.
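One possible way to pick the single URL, sketched under assumptions: prefer `course_link`, fall back to `course_application_url`, and only then to the Prolearn path. The preference order and the Prolearn fallback are not decided in this thread, and `select_course_url` and `to_course_fields` are hypothetical names.

```python
PROLEARN_BASE_URL = "https://prolearn.mit.edu"


def select_course_url(document):
    """Return the URL to link to, preferring the source site's own URLs."""
    for key in ("course_link", "course_application_url"):
        value = document.get(key)
        if value:
            return value
    # Assumed fallback: the relative Prolearn path, made absolute.
    path = document.get("url")
    return f"{PROLEARN_BASE_URL}{path}" if path else None


def to_course_fields(document):
    """Keep the full API response in raw_json and a single chosen url."""
    return {"url": select_course_url(document), "raw_json": document}
```

Because the whole document goes into `raw_json`, the other URLs stay available even though only one is surfaced.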

mbertrand commented 1 year ago

> We need a mapping CSV file

I think they are UCC topics; we already have a CSV mapping file for those.
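For illustration, a sketch of applying a CSV topic mapping. The column names (`ucc_topic`, `open_topic`) and the sample rows are placeholders, not the contents of the existing mapping file, and dropping unmapped topics is an assumption.

```python
import csv
import io

# Placeholder rows only; the real mapping file's columns and values may differ.
SAMPLE_MAPPING_CSV = """ucc_topic,open_topic
AI/Machine Learning,Example Topic A
Finance,Example Topic B
"""


def load_topic_mapping(csv_file):
    """Read a two-column CSV into a {ucc_topic: open_topic} dict."""
    reader = csv.DictReader(csv_file)
    return {row["ucc_topic"].strip(): row["open_topic"].strip() for row in reader}


def map_topics(ucc_topics, mapping):
    """Map UCC topic names, skipping unmapped ones and removing duplicates."""
    mapped = []
    for topic in ucc_topics:
        target = mapping.get(topic)
        if target and target not in mapped:
            mapped.append(target)
    return mapped


mapping = load_topic_mapping(io.StringIO(SAMPLE_MAPPING_CSV))
```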