morinted / schedule-generator

A schedule generator for the University of Ottawa written in Java, using OCSF.

Should do some sort of automated testing on scraped schedules #47

davidschlachter closed 4 years ago

davidschlachter commented 4 years ago

Course updates frequently break in ways that are currently detected only by my very incomplete manual testing or by bug reports from users. Automated testing to validate scraped schedules would improve the program's reliability for users.

However, the main challenge would be to validate schedules without obtaining the validation data from the same system that scrapes the data under test.

One possible solution could be to detect situations that are known to cause silent failures. For example, if two sections have the following activities:

This would indicate a scraping error in determining sections from the uOttawa data and will cause silent failures for users. This type of situation could be automatically flagged and trigger a notification that scraping has had some failures. These types of errors should be tested for and raised in the scraping program since this is now the most common failure point for users of this project.
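A minimal sketch of what such a flagging pass might look like, in Java only because that is the project's stated language (the scraper's own language is not given in this thread). The concrete failure pattern is not shown here, so the rule below (a section with no activities at all) is only a stand-in, and `ScrapeSanityCheck`, `findSuspectSections`, and the map layout are all hypothetical names, not part of the existing code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: nothing here comes from the schedule-generator code base.
final class ScrapeSanityCheck {

    /**
     * Takes a map from a section label (for example "CEG2136 A") to the list of
     * activity names scraped for that section, and returns the labels that look
     * suspicious, in the iteration order of the input map.
     */
    static List<String> findSuspectSections(Map<String, List<String>> activitiesBySection) {
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : activitiesBySection.entrySet()) {
            List<String> activities = entry.getValue();
            // Stand-in rule (an assumption): a section without any activities is
            // treated as a likely scraping error. A real check would encode the
            // failure patterns actually observed in the uOttawa data.
            if (activities == null || activities.isEmpty()) {
                suspects.add(entry.getKey());
            }
        }
        return suspects;
    }
}
```

A non-empty result could then trigger the notification that scraping has had some failures, instead of letting the bad data reach users silently.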

davidschlachter commented 4 years ago

If this is implemented by somehow comparing scraped schedules to reference schedules, then the courses most commonly searched for would probably be the best candidates. Here are the top ones from April 2018 – April 2020 (with the number of times searched):

 496 CEG2136
 451 CSI2110
 411 SEG2105
 392 ENG1112
 371 MAT1320
 331 MAT1322
 331 CHM1311
 321 MAT1341
 316 ITI1120
 316 ECO1104
 285 MAT2377
 260 PHI1101
 239 ECO1102
 214 ENG1100
 211 ADM1340
 205 MAT1348
 204 ITI1121
 194 ITI1100
 192 ECO1504
 190 PSY1102
 190 ECO1502
 189 CSI2132
 177 PSY1101
 172 MAT1300
 172 CSI2101
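
A comparison against reference schedules for courses like these could be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: `ReferenceCheck`, `compare`, and the course-code-to-section-identifier maps are hypothetical, and the reference data is assumed to come from somewhere other than the scraper itself (for example, entered by hand) and to be refreshed whenever the real timetable changes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: none of these names come from the schedule-generator code base.
final class ReferenceCheck {

    /**
     * Compares scraped section identifiers against independently obtained
     * reference data. Both maps go from a course code (e.g. "CEG2136") to the
     * set of section identifiers for that course. Returns one message per
     * discrepancy; an empty list means every reference course matched.
     */
    static List<String> compare(Map<String, Set<String>> reference,
                                Map<String, Set<String>> scraped) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : reference.entrySet()) {
            String course = entry.getKey();
            Set<String> expected = entry.getValue();
            Set<String> actual = scraped.get(course);
            if (actual == null) {
                problems.add(course + ": missing from scraped data");
                continue;
            }
            // Sections in the reference that the scraper did not produce.
            Set<String> missing = new TreeSet<>(expected);
            missing.removeAll(actual);
            // Sections the scraper produced that the reference does not know about.
            Set<String> unexpected = new TreeSet<>(actual);
            unexpected.removeAll(expected);
            if (!missing.isEmpty()) {
                problems.add(course + ": missing sections " + missing);
            }
            if (!unexpected.isEmpty()) {
                problems.add(course + ": unexpected sections " + unexpected);
            }
        }
        return problems;
    }
}
```

Only courses present in the reference map are checked, so starting with a handful of the most-searched courses keeps the hand-maintained data small.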

davidschlachter commented 4 years ago

Implemented in 58a1c389175abfb9dd89dee6950e758e798eb614 after an email conversation with the uschedule.me team. Each time schedules are updated, the test results will be available at https://schlachter.ca/schedgen/latest-unit-test-results.txt.