Create common lecture update handling.

adamcik commented 14 years ago

Currently the db and web scrapers contain a lot of more or less duplicated code for syncing lecture data. The db version is in better shape and does a better job of handling "duplicate" lectures like EDU3077 on Wednesdays during spring 2010. Handling such cases correctly will require distinguishing lectures by room and week, in addition to the groups that the db code already compares today. The web code, on the other hand, simply dies when it finds a dupe.

To simplify maintenance and fix the dupe handling for both scrapers, the db scraper should be rewritten to create an intermediate data structure like the web scraper already does, and the two should share a common function for handling this data structure. An added bonus is that this function will actually be testable :-)
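
A minimal sketch of what such a shared, testable function could look like. Everything named here is hypothetical (`LectureData`, `plan_lecture_sync`, the field names and the room names are not taken from the plan code base); it only illustrates the idea of both scrapers emitting the same plain records and handing them to one function that decides what to add or delete without touching the database.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class LectureData:
    """Hypothetical intermediate record that both scrapers would emit."""
    course: str
    day: int                 # 0 = Monday
    start: str               # "HH:MM"
    end: str                 # "HH:MM"
    groups: FrozenSet[str] = frozenset()
    rooms: FrozenSet[str] = frozenset()
    weeks: FrozenSet[int] = frozenset()


def _sort_key(lecture: LectureData):
    # Sort the sets so the key is deterministic and comparable.
    return (lecture.course, lecture.day, lecture.start, lecture.end,
            sorted(lecture.groups), sorted(lecture.rooms), sorted(lecture.weeks))


def plan_lecture_sync(
    existing: Iterable[LectureData],
    scraped: Iterable[LectureData],
) -> Tuple[List[LectureData], List[LectureData]]:
    """Return (to_add, to_delete) as plain lists; no database access."""
    existing_set = set(existing)
    scraped_set = set(scraped)
    to_add = sorted(scraped_set - existing_set, key=_sort_key)
    to_delete = sorted(existing_set - scraped_set, key=_sort_key)
    return to_add, to_delete


# Two "duplicate" lectures that differ only by room (room names are made up).
stored = LectureData("EDU3077", 2, "10:15", "12:00", rooms=frozenset({"R1"}))
dupe = LectureData("EDU3077", 2, "10:15", "12:00", rooms=frozenset({"R2"}))
to_add, to_delete = plan_lecture_sync([stored], [stored, dupe])
```

Because rooms and weeks are part of the record's equality in this sketch, the two Wednesday entries are treated as distinct lectures instead of colliding.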

adamcik commented 14 years ago

The same idea can probably be extended to course syncing as well.

adamcik commented 14 years ago

The current implementation in the db scraper almost works well enough. The remaining problem is that it is not stable, in the sense that which dupe gets which set of room/lecturer/weeks varies.

adamcik commented 14 years ago

Both scrapers now use the same function to process data. Now it just needs to become "stable" with respect to dupes.

adamcik commented 14 years ago

086dbb0cfcbbe5c36ce0ad890ff04a61a0c71507 makes the ordering consistent so that the processing is more stable. For a complete fix, more variables than groups should be considered when checking for equality, but that is out of scope for this issue.
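
For illustration only (this is not the code from the commit, and `pair_dupes` plus the dict keys are hypothetical): one way a consistent ordering can make dupe handling stable is to bucket lectures by their groups and sort both the stored and the scraped side by the same key before pairing them up, so the same dupe always receives the same room/weeks set.

```python
from collections import defaultdict


def pair_dupes(existing, scraped):
    """Pair stored lectures with scraped ones in a repeatable order.

    Both arguments are lists of dicts with 'groups', 'rooms' and 'weeks'
    keys (an assumed shape). Lectures are considered equal on groups only;
    rooms and weeks are used purely to make the order deterministic.
    """
    def group_key(lecture):
        return tuple(sorted(lecture["groups"]))

    def order_key(lecture):
        return (tuple(sorted(lecture["rooms"])), tuple(sorted(lecture["weeks"])))

    buckets = defaultdict(lambda: ([], []))
    for lecture in existing:
        buckets[group_key(lecture)][0].append(lecture)
    for lecture in scraped:
        buckets[group_key(lecture)][1].append(lecture)

    pairs = []
    for key in sorted(buckets):
        old, new = buckets[key]
        # Same sort on both sides, so input order no longer matters.
        old.sort(key=order_key)
        new.sort(key=order_key)
        # zip() drops unmatched leftovers; a real sync would add/delete them.
        pairs.extend(zip(old, new))
    return pairs


e1 = {"id": 1, "groups": ["A"], "rooms": ["R2"], "weeks": [1, 2]}
e2 = {"id": 2, "groups": ["A"], "rooms": ["R1"], "weeks": [3]}
s1 = {"groups": ["A"], "rooms": ["R1"], "weeks": [3]}
s2 = {"groups": ["A"], "rooms": ["R2"], "weeks": [1, 2]}

forward = pair_dupes([e1, e2], [s1, s2])
backward = pair_dupes([e2, e1], [s2, s1])
```

Here `forward` and `backward` contain the same pairs even though the inputs arrive in different orders. Sorting only stabilises the pairing; it does not replace comparing rooms and weeks for equality.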

adamcik / plan

Create common lecture update handling. #40