sipb / hydrant

MIT semester course planning app
https://hydrant.mit.edu/
MIT License

handle arbitrary section kinds #36

Closed reeceyang closed 1 year ago

reeceyang commented 1 year ago

TODO from the comment on this line

https://github.com/sipb/hydrant/blob/3bad537ba0226c08d29901d9bf4110f39801b2bf/scrapers/fireroad.py#L66

Some classes' schedules are broken because Hydrant doesn't handle arbitrary section kinds. For example, 4.021 design studio's sections are called "Design." I listed out all the classes with this issue and they're all classes with "Design" sections (except for 2.00B Toy Product Design, which has the text "Labs offered: 9-12, 2-5, 7-10. Students will choose the time. That is best for their schedu.").

there are a few possible fixes. We could add a Design section kind, which would fix this problem for now (and probably for the foreseeable future; tbh I doubt MIT classes are suddenly going to start adding many different section kinds). ofc, this wouldn't "handle arbitrary section kinds."

we could also change sections to be a map mapping the section type as a string to an Array<RawSection>, e.g.:

{
  "Lab": [[[[10, 2], [70, 2]], "34-101"]],
  "Design": [[[[10, 2], [70, 2]], "34-101"]],
  "Some other section kind": [[[[10, 2], [70, 2]], "34-101"]],
}

it seems like this would involve refactoring more of the code though. another downside(?) is the strings in the objects are longer (which I assume is the purpose of having 1-2 letter strings for all the raw class object keys?).
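On the scraper side, the map-based layout could be built roughly like this. This is only a sketch: `group_sections` and the `(kind, section)` input pairs are hypothetical names, and each section just mirrors the shape in the example above (a list of number pairs plus a room string).

```python
from collections import defaultdict

# Shapes mirror the example above; these aliases are illustrative,
# not Hydrant's actual definitions.
RawTimeslot = tuple[int, int]
RawSection = tuple[list[RawTimeslot], str]


def group_sections(parsed):
    """Group (kind, section) pairs into a dict keyed by section kind.

    `parsed` is assumed to be an iterable of (kind, RawSection) pairs.
    """
    sections = defaultdict(list)
    for kind, section in parsed:
        sections[kind].append(section)
    return dict(sections)


example = [
    ("Lab", ([(10, 2), (70, 2)], "34-101")),
    ("Design", ([(10, 2), (70, 2)], "34-101")),
    ("Design", ([(12, 2), (72, 2)], "34-101")),
]
grouped = group_sections(example)
```

Here `grouped["Design"]` holds two sections and `grouped["Lab"]` holds one, so any string can serve as a section kind without touching an enum.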

I'm down to do either this refactoring or the adding "Design" section kind, provided that hydrant maintainers are ok with me messing with the class data type.

reeceyang commented 1 year ago

oh there's also the raw time, which could be stored alongside the Array<RawSection> in the object above but also we could just generate the raw time string dynamically on the frontend

psvenk commented 1 year ago

@reeceyang Thanks for the issue, and for taking the time to write this up so thoroughly.

Others can feel free to chime in, but I think it would be better to add a "Design" section kind while changing the highlighted line of code from continue to something along the lines of print(f"Unknown section kind {name}"), instead of dispensing with the enum and just using strings. That way, in cases such as 2.00B, we would know to add a custom override instead of blindly treating something like "Labs offered: 9-12, 2-5, 7-10. Students will choose the time. That is best for their schedu." as a section kind (and thereby, in this case, ignoring the actual lab timings of 2.00B).
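A rough sketch of that idea, not the actual code in scrapers/fireroad.py: the function name `split_schedule` is hypothetical, and the input format (semicolon-separated chunks, each a kind name followed by comma-separated section strings) is an assumption. It also assumes unknown kinds are still skipped after being reported.

```python
# Section kinds the scraper would recognize, now including "Design".
KNOWN_SECTION_KINDS = ("Lecture", "Recitation", "Lab", "Design")


def split_schedule(schedule: str) -> dict[str, list[str]]:
    """Group raw section strings by kind, reporting unknown kinds.

    Assumes `schedule` looks like "Kind,sec1,sec2;Kind,sec3" (an assumed
    format used only for illustration).
    """
    result: dict[str, list[str]] = {}
    for chunk in schedule.split(";"):
        name, *sections = chunk.split(",")
        if name not in KNOWN_SECTION_KINDS:
            # Report instead of silently skipping, so a maintainer
            # knows an override may be needed for this class.
            print(f"Unknown section kind {name}")
            continue
        result.setdefault(name, []).extend(sections)
    return result


# Placeholder section strings; "Studio" stands in for an unknown kind.
parsed = split_schedule("Lecture,A;Design,B,C;Studio,D")
```

With this input, `parsed` keeps the "Lecture" and "Design" sections and prints a message for "Studio" rather than dropping it quietly.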

psvenk commented 1 year ago

Also, the Data Warehouse has separate fields HAS_$x_SECTION for $x ∈ {LECTURE, RECITATION, LAB, DESIGN}, so there shouldn't be any other section kinds.

reeceyang commented 1 year ago

cool, sounds like adding in the "Design" section kind is the way to go. I can work on the refactor if you're able to assign me to this issue?

For the 2.00B override, would that just be adding a check in scrapers/fireroad.py and hardcoding in the times?

psvenk commented 1 year ago

Sure; thanks. The 2.00B override should be possible to do from the FireRoad server side (tracking issue https://github.com/venkatesh-sivaraman/fireroad-server/issues/45); I'll work on implementing an override in the next few days or so. However, in general, scrapers/package.py has Hydrant-specific overrides (hopefully the current 2.00B Hydrant override will be made unnecessary as well).
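For illustration only, a per-class override step might look something like the following. This is not the actual contents of scrapers/package.py: `OVERRIDES`, `apply_overrides`, and the `schedule_note` field are all hypothetical, and the override value is a placeholder rather than real 2.00B data.

```python
import copy

# Hypothetical override table: course number -> fields to replace.
# The value below is a placeholder, not real schedule data.
OVERRIDES = {
    "2.00B": {"schedule_note": "placeholder; real lab times would go here"},
}


def apply_overrides(courses, overrides):
    """Return a copy of `courses` with per-course override fields applied."""
    merged = copy.deepcopy(courses)
    for number, fields in overrides.items():
        # Only touch courses that were actually scraped.
        if number in merged:
            merged[number].update(fields)
    return merged


scraped = {
    "2.00B": {"name": "Toy Product Design", "schedule_note": "unparsed text"},
    "4.021": {"name": "Design Studio"},
}
patched = apply_overrides(scraped, OVERRIDES)
```

The scraped data is left untouched and only the listed fields change, which keeps hand-written fixes separate from whatever the upstream source provides.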

reeceyang commented 1 year ago

keeping in line with the 1-2 character naming convention, 'e' and 'g' are still available out of all the letters in "design" (and then "er" and "gr" for the raw times). is there any preference for which one to use?

cjquines commented 1 year ago

we're past semester crunch time, so it might be good to refactor the thing to be more verbose, although idk if now is an appropriate time

reeceyang commented 1 year ago

yeah i could also anticipate the change and name it "designSections" or something