Closed adamcik closed 14 years ago
The same idea can probably be extended to course syncing as well.
The current implementation in the db scraper almost works well enough. The remaining problem is that it is not stable: which dupe gets which set of room/lecturer/weeks varies from run to run.
Both scrapers now use the same function to process data. Now it just needs to become "stable" with respect to dupes.
086dbb0cfcbbe5c36ce0ad890ff04a61a0c71507 makes the ordering consistent so that the processing is more stable. For a complete fix, more fields than just groups should be considered when checking for equality, but that is out of scope for this issue.
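For illustration, here is a rough sketch of what "consistent ordering" could look like. It is not the code from the commit above; the dict layout and field names (`course`, `day`, `start`, `end`, `groups`, `rooms`, `weeks`) are assumptions made up for this example. The idea is just to sort on every known field before processing, so the input order from the scraper no longer decides which dupe ends up with which data.

```python
def stable_order(lectures):
    """Return lectures sorted on all known fields.

    The field names are hypothetical; the point is only that the
    result does not depend on the order the scraper produced.
    """
    def sort_key(lecture):
        return (
            lecture["course"],
            lecture["day"],
            lecture["start"],
            lecture["end"],
            # Sort the multi-valued fields so their internal order is irrelevant too.
            tuple(sorted(lecture["groups"])),
            tuple(sorted(lecture["rooms"])),
            tuple(sorted(lecture["weeks"])),
        )

    return sorted(lectures, key=sort_key)
```

With this, two runs that see the same lectures in a different order would process them in the same sequence.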
Currently the db and web scrapers contain a lot of more or less duplicated code for syncing lecture data. The db version is in better shape and does a better job of handling "duplicate" lectures like EDU3077 on Wednesdays during spring 2010. Correct handling of such cases will require distinguishing lectures by rooms and weeks, in addition to the groups that the db code compares today. The web code, on the other hand, simply dies when it finds a dupe.
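A minimal sketch of the difference between comparing on groups only and comparing on groups, rooms and weeks. The field names and the sample values (times, rooms, weeks) are invented for the example and are not taken from the real EDU3077 data.

```python
def groups_only_key(lecture):
    """Identity based on course, time slot and groups only."""
    return (
        lecture["course"], lecture["day"], lecture["start"], lecture["end"],
        frozenset(lecture["groups"]),
    )


def full_key(lecture):
    """Identity that also takes rooms and weeks into account."""
    return groups_only_key(lecture) + (
        frozenset(lecture["rooms"]),
        frozenset(lecture["weeks"]),
    )


# Two hypothetical "duplicate" lectures: same slot and groups,
# but held in different rooms during different weeks.
first = {"course": "EDU3077", "day": "Wednesday", "start": "10:15",
         "end": "12:00", "groups": ["A"], "rooms": ["R1"], "weeks": [2, 3, 4]}
second = dict(first, rooms=["R2"], weeks=[5, 6, 7])

print(groups_only_key(first) == groups_only_key(second))  # True
print(full_key(first) == full_key(second))                # False
```

With the groups-only key the two entries are indistinguishable, which is why the assignment of rooms and weeks to each dupe can flip around.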
To simplify maintenance and fix the dupe handling for both scrapers, the db scraper should be rewritten to create an intermediate data structure like the web scraper already does, and both should share a common function for handling this data structure. An added bonus is that this function will actually be testable :-)