Open GuyWhoCode opened 5 months ago
Discord Name(s): .thedaniel, saltbagels
Why is ML necessary? Can't you query and regex all pre-reqs then build a graph?
@392781 Luciano recommended using ML because of all the edge cases that come with parsing the pre-reqs, although so far the person assigned to this issue has not used ML. Some examples of those edge cases:
CHM 123 or CHM 1220 ; and CHM 123L or CHM 1220L ; or concurrent enrollment in CHM 2010 .
AMM 360 or AMM 3600 ; and AMM 360L or AMM 3600L .
ACC 207 and 207A or ACC 2070 ; and CIS 101, CIS 1010 , or PCPT.
BIO 121/BIO 121L, BIO 122/BIO 122L, and BIO 123/BIO 123L; BIO 121/BIO 121L and BIO 122/BIO122L/BIO 1220C; BIO 121/BIO 121L/BIO 1210B and BIO 1220 / BIO 1220L ; or BIO 1210 / BIO 1210L and BIO 1220 / BIO 1220L .
I don't think it's that difficult to use regex queries if you exclude support for 3-digit course names, right? Correct me if I'm wrong, but 3-digit course names were deprecated over 5 years ago.
I didn't use regex. I just kept every string that started with a department name such as BIO or CS and was followed by a number like 121 or 1210, and then stripped every "/", "," and "." from the string. I also got rid of the deprecated courses by building an array of all the classes that currently exist and dropping anything in my scraped data that wasn't in that list. I manually checked it against about 10 different majors and it worked like 95% of the time.
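For anyone who wants to try this, here is a rough sketch of how the no-regex filtering described above might look in Python. The function name, the department set, and the list of current courses are all made up for illustration; the real lists would come from the scraped catalog.

```python
def extract_current_courses(text, departments, current_courses):
    """Return the currently offered courses mentioned in a pre-req string."""
    # Replace the separator punctuation with spaces so tokens split cleanly.
    for ch in "/,;.":
        text = text.replace(ch, " ")
    tokens = text.split()

    found = []
    # Look at each adjacent pair: a department code followed by a course number.
    for dept, number in zip(tokens, tokens[1:]):
        if dept in departments and number[:1].isdigit():
            course = f"{dept} {number}"
            # Drop deprecated courses by checking against the current list.
            if course in current_courses and course not in found:
                found.append(course)
    return found


# Hypothetical inputs; the pre-req string is one of the examples from this thread.
departments = {"CHM"}
current_courses = {"CHM 1220", "CHM 1220L", "CHM 2010"}
prereq = "CHM 123 or CHM 1220 ; and CHM 123L or CHM 1220L ; or concurrent enrollment in CHM 2010 ."
print(extract_current_courses(prereq, departments, current_courses))
# → ['CHM 1220', 'CHM 1220L', 'CHM 2010']
```

Note that this only collects course names. It throws away the and/or structure, and it would miss a course written without a space such as BIO122L.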
95% of the time it works 100% of the time.
Spoken like a true PhD Statistics candidate.
95% of the time it works 100% of the time. That's good enough for me lmao
[A-Z]{2,3}\s\d{3,4}[A-Z]?
###################
[A-Z]{2,3} # grabs 2 or 3 uppercase letters (course major); \w would also match words like "and"
\s # grabs a single whitespace character
\d{3,4} # grabs a 3 or 4 digit course number
[A-Z]? # grabs 0 or 1 uppercase letter for lab/discussion/activity
This would grab all 3- and 4-digit courses (including labs/discussions/activities). The issue from there would be getting the connective material in between. It would probably be best to parse in "levels", where you perform a string split on different tokens: first split on ";", which creates a list of individual requirements, then split on "and"/"or" to get down to the fine-grained requirements.
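A minimal sketch of that "levels" idea in Python, in case it helps. It uses `[A-Z]` character classes for the department code and the suffix so that connective words are never matched as courses. The function name and the output shape (`connector`, `all_of`) are assumptions of mine, not an agreed format.

```python
import re

# Course pattern: 2-3 uppercase letters, whitespace, 3-4 digits, optional letter suffix.
COURSE = re.compile(r"[A-Z]{2,3}\s\d{3,4}[A-Z]?")
# A clause may start with the connective that joins it to the previous clause.
LEADING = re.compile(r"^\s*(and|or)\b", re.IGNORECASE)


def parse_prereqs(text):
    """Split on ';' first, then on 'and', then on 'or'."""
    clauses = []
    for raw in text.split(";"):                          # level 1: individual requirements
        m = LEADING.match(raw)
        connector = m.group(1).lower() if m else None
        body = raw[m.end():] if m else raw
        all_of = []
        for conjunct in re.split(r"\band\b", body):      # level 2: every part is required
            any_of = []
            for alt in re.split(r"\bor\b", conjunct):    # level 3: any one of these
                any_of.extend(COURSE.findall(alt))
            if any_of:
                all_of.append(any_of)
        if all_of:
            clauses.append({"connector": connector, "all_of": all_of})
    return clauses


print(parse_prereqs("AMM 360 or AMM 3600 ; and AMM 360L or AMM 3600L ."))
# → [{'connector': None, 'all_of': [['AMM 360', 'AMM 3600']]},
#    {'connector': 'and', 'all_of': [['AMM 360L', 'AMM 3600L']]}]
```

This still drops qualifiers like "concurrent enrollment in", does not handle slash-separated groups such as BIO 121/BIO 121L, and misses courses written without a space, so the harder examples earlier in the thread would need extra handling.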
(Also not yet a candidate :^) but thank you)
Added a breakdown of regex
User Story
As a developer, I want to standardize course pre-requisite information in an easy-to-parse format to render a course pre-req chart.
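Purely as an illustration of what an "easy-to-parse format" could look like (this is a hypothetical shape, not a decided spec): each course stores its pre-reqs as a list of groups, where every group is required and the courses inside a group are interchangeable. That flattens into graph edges for the chart. The course "AMM 4000" and the helper `to_edges` are invented for the example.

```python
import json


def to_edges(course, prereq_groups):
    """Flatten grouped pre-reqs into edges for a chart renderer."""
    edges = []
    for group_id, alternatives in enumerate(prereq_groups):
        for prereq in alternatives:
            # Edges sharing a group id are "or" alternatives; distinct ids are "and".
            edges.append({"from": prereq, "to": course, "group": group_id})
    return edges


record = {
    "course": "AMM 4000",  # hypothetical target course
    "prereqs": [["AMM 360", "AMM 3600"], ["AMM 360L", "AMM 3600L"]],
}
edges = to_edges(record["course"], record["prereqs"])
print(json.dumps(edges[0]))
# → {"from": "AMM 360", "to": "AMM 4000", "group": 0}
```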
Technical Tasks
Acceptance Criteria
Note
The end goal of this task is to prepare for generating the image below for every major.
https://codesandbox.io/p/sandbox/busy-snow-5ss3hl?file=%2Findex.js