Yuyu-Ren / mcgill-course-map

Discover McGill: a graph of interrelated courses at McGill. This fork adds support for retrieving the prerequisites of a given course (plus a color gradient based on the depth of each node).
https://course-map-banana.api.tianshome.com/
GNU General Public License v3.0

course_spider.py does not pull coreq/prereq correctly #1

Open Yuyu-Ren opened 3 years ago

Yuyu-Ren commented 3 years ago

Take a look at COMP350, for example. Its requirements include an "And one of" clause with two "and"s. Other courses list their prerequisites or corequisites as alternatives joined by "or", but the current logic parses them as if every listed course were required. So something like "COMP250 or COMP202 or COMP208" ends up in the dump as the requirement "COMP250, COMP202, COMP208", which is not the right behavior.

Yuyu-Ren commented 3 years ago

Perhaps we can represent the prereq and coreq fields with the following schema?

All elements within the list are ANDed together, and a nested OR group is satisfied by any one of its courses: [COURSE1, COURSE2, OR [COURSE3, COURSE4, COURSE5]]
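A minimal sketch of how requirement text could be turned into that schema. This is not the code in course_spider.py: the function name parse_requirements, the {"OR": [...]} dict used for the nested group, and the course-code pattern (four capital letters followed by three digits) are all assumptions made for illustration. It only splits on "and" and ";", so comma-separated lists and clauses like "And one of" would need more work.

```python
import re

# Assumed course-code shape: four capital letters, optional space, three digits.
COURSE_RE = re.compile(r"\b([A-Z]{4})\s?(\d{3})\b")


def parse_requirements(text):
    """Return a list whose items are ANDed together.

    A plain string is a single required course. A dict {"OR": [...]}
    (a hypothetical encoding of the OR group) is satisfied by any one
    of the listed courses.
    """
    requirements = []
    # Each chunk between "and" / ";" is treated as one AND term.
    for chunk in re.split(r"\band\b|;", text, flags=re.IGNORECASE):
        # Courses sharing a chunk are assumed to be alternatives ("or").
        alternatives = [subj + num for subj, num in COURSE_RE.findall(chunk)]
        if len(alternatives) == 1:
            requirements.append(alternatives[0])
        elif alternatives:
            requirements.append({"OR": alternatives})
    return requirements


print(parse_requirements("COMP250 or COMP202 or COMP208"))
print(parse_requirements("MATH 240 and COMP 250 or COMP 251"))
```

With this encoding the example from the first comment becomes a single OR group instead of three separate required courses.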

Yuyu-Ren commented 3 years ago

We can try the following XPath: //li[contains(p, 'Corequisite(s)')]//text() But that only retrieves the keywords, so it might not be very helpful to us.

Yuyu-Ren commented 3 years ago

//li[contains(p, 'Prerequisite')] works as is and dumps both the links and the surrounding text. I realized that the current business logic pulls out only the links and strips them down to the course code, so the "and"/"or" wording between them is lost.
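To illustrate the difference between keeping only the links and keeping the whole list item, here is a small sketch. It uses the standard library's xml.etree.ElementTree rather than the spider's own selectors, and since ElementTree's XPath subset has no contains(), the keyword filter is done in Python. The SAMPLE markup and the helper name requirement_text are made up for the example; the real course page may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for a course page.
SAMPLE = """
<ul>
  <li><p>Prerequisite(s): <a href="/c/comp-250">COMP 250</a> or <a href="/c/comp-202">COMP 202</a></p></li>
  <li><p>Terms: Fall</p></li>
</ul>
"""


def requirement_text(root, keyword):
    """Return the whitespace-normalized text of the first <li> containing keyword."""
    for li in root.iter("li"):
        # itertext() yields link text and the text between links, in order.
        text = " ".join("".join(li.itertext()).split())
        if keyword in text:
            return text
    return None


root = ET.fromstring(SAMPLE)

# Links only: the course codes survive, but the "or" between them is gone.
links_only = [a.text for a in root.iter("a")]

# Full text: the connective words are still available for parsing.
full_text = requirement_text(root, "Prerequisite")

print(links_only)
print(full_text)
```

Keeping the full text is what would let a later parsing step tell "A or B" apart from "A and B".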