ryanhugh / searchneu

Search over Classes, Professors and Employees at NEU!
https://searchneu.com
GNU Affero General Public License v3.0

Historical data analysis #53

Open edward-shen opened 6 years ago

edward-shen commented 6 years ago

A friend of mine asked whether we could do some historical analysis and indicate on each course when it is usually offered. This would let users plan out their schedules, especially for courses that have historically been offered only in the spring or only in the fall.

We have banner information since 2012, so we should be able to use that.

I was thinking of something like this: we count how often a class was offered in each semester. For example, if a class was offered in the fall in 5 of the last 6 years, we'd add "Generally offered in the fall (83.3% of the time)." If a class was offered in the summer only twice in the last 6 years, but every spring, we'd have "Historically always offered in the spring, rarely offered in the summer (33.3% of the time)."
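A rough sketch of that calculation in Python, just to pin down the arithmetic. The function names, the `(year, season)` input shape, and the 50% cutoff between "generally" and "rarely" are all assumptions made for illustration, not anything that exists in the codebase:

```python
from collections import Counter


def offering_rates(offerings, num_years):
    """Return {season: fraction of years the course ran in that season}.

    offerings: iterable of (year, season) pairs, one per term the course ran.
    num_years: how many years of history we are looking at (e.g. 6).
    """
    # Deduplicate so several sections in the same term count only once.
    counts = Counter(season for _, season in set(offerings))
    return {season: count / num_years for season, count in counts.items()}


def describe(season, rate):
    """Turn a rate into the kind of sentence proposed above."""
    pct = f"{rate * 100:.1f}%"
    if rate >= 1.0:
        return f"Historically always offered in the {season}"
    if rate >= 0.5:  # assumed cutoff, open to tuning
        return f"Generally offered in the {season} ({pct} of the time)"
    return f"Rarely offered in the {season} ({pct} of the time)"
```

For a course that ran in 5 of 6 falls, `describe("fall", offering_rates(...)["fall"])` gives "Generally offered in the fall (83.3% of the time)".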

Alternatively, we could have a table that shows each percentage.

We'd effectively add 4-6 new fields for every course, one for each semester.

This poses some initial problems. Will there be 3 slots for summer courses (Full Summer, Summer I, Summer II)? We'd also likely need to cache Banner data, because I have a feeling the new API will no longer expose data from past terms.
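One way to keep the summer question open is to decide the bucketing in a single place. A minimal sketch, assuming term IDs end in a two-digit code that identifies the semester; the specific suffixes and the `term_label` helper below are placeholders for illustration, not confirmed Banner codes:

```python
# Hypothetical suffix -> label mapping; the real Banner term codes may differ.
TERM_SUFFIXES = {
    "10": "Fall",
    "30": "Spring",
    "40": "Summer I",
    "50": "Full Summer",
    "60": "Summer II",
}


def term_label(term_id, merge_summer=False):
    """Map a term ID to a semester label, optionally merging all summers."""
    label = TERM_SUFFIXES.get(str(term_id)[-2:])
    if label is None:
        raise ValueError(f"unknown term suffix in {term_id!r}")
    if merge_summer and "Summer" in label:
        # Collapse Full Summer, Summer I, and Summer II into one bucket.
        return "Summer"
    return label
```

With `merge_summer=True` the three summer terms share one field; without it each gets its own, so the choice only changes how many per-course fields we store.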

edward-shen commented 6 years ago

Solving #24 would also probably help with this.

ryanhugh commented 6 years ago

Sounds good to me! Once we add the ability to reuse data that we have already scraped, we can load in all the data going back to 2012 and have everything available for processing. Let's finish #24 before this.