kshitij10496 / hercules

The mighty hero helping you build projects on top of IIT Kharagpur's academic data
https://hercules-10496.herokuapp.com/api/v1/static/index.html
MIT License

Storing historical data from the ERP #47

Open icyflame opened 5 years ago

icyflame commented 5 years ago

This was discussed during Demo Day 10 on 9th November, 2018.

Some of the data on the ERP has historical value. Some data (e.g. grades) is obviously useful historically, and should be a part of the present semester's data.

There is other auxiliary information that would be useful for applications that might be developed later, e.g. timetables and historical slot information.


Three components are required for the storage and retrieval of historical data to be frictionless:

  1. A method to retrieve the data through an API endpoint that uses a unique semester identifier

There is no semester-specific code. Giving the identifier, say /2018FALL/timetable/MA20001, will get you the data from that semester, whereas /timetable/MA20001 would get you the data from the ongoing semester.
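As a rough illustration of how one handler could serve both kinds of path, here is a minimal Python sketch. The identifier pattern (four-digit year followed by FALL or SPRING), the `Route` tuple and the `parse_path` helper are assumptions made for this example, not part of hercules.

```python
import re
from typing import NamedTuple

# Assumed identifier format: four-digit year followed by FALL or SPRING.
SEMESTER_RE = re.compile(r"^\d{4}(FALL|SPRING)$")
PRESENT = "PRESENT"


class Route(NamedTuple):
    semester: str  # "PRESENT" or an identifier such as "2018FALL"
    resource: str  # e.g. "timetable"
    key: str       # e.g. a course code such as "MA20001"


def parse_path(path: str) -> Route:
    """Split a request path into (semester, resource, key).

    A leading segment that looks like a semester identifier selects
    historical data; without one, the ongoing semester is assumed.
    """
    parts = [p for p in path.split("/") if p]
    semester = PRESENT
    if parts and SEMESTER_RE.match(parts[0]):
        semester, parts = parts[0], parts[1:]
    if len(parts) != 2:
        raise ValueError(f"unsupported path: {path!r}")
    return Route(semester, parts[0], parts[1])
```

With this shape, the handler itself never mentions a particular semester; it only looks up whatever container `Route.semester` names.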

  2. A way to make present data historical and create empty present data

These (scripts?) make the movement of data between present and history easier. Once this is implemented, when 2018FALL is over, these two scripts will move the data in "PRESENT" to "2018FALL" and create a new, empty set of "PRESENT" data that refers to the next semester, "2019SPRING".

Because of (1), as soon as the data is moved to "2018FALL" specific data containers, the endpoints for historical data of that semester will magically start working! (Not magical as we have implemented it, but it would be magical because of how little you had to do to get it started)
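The two rollover steps might look something like the sketch below. It assumes, purely for illustration, that all data lives in one dictionary keyed by semester name; the function names `archive_present` and `start_new_present` are hypothetical.

```python
from typing import Any, Dict

# Assumed layout: semester name -> that semester's data.
Store = Dict[str, Dict[str, Any]]


def archive_present(store: Store, semester_id: str) -> None:
    """Move everything under "PRESENT" to the given semester identifier."""
    if semester_id in store:
        # Refuse to overwrite a semester that has already been archived.
        raise ValueError(f"{semester_id} already exists")
    store[semester_id] = store.pop("PRESENT", {})


def start_new_present(store: Store) -> None:
    """Create an empty "PRESENT" container for the next semester."""
    if store.get("PRESENT"):
        raise ValueError("PRESENT still holds data; archive it first")
    store["PRESENT"] = {}
```

Running `archive_present(store, "2018FALL")` followed by `start_new_present(store)` at the end of the semester would leave the old data reachable under "2018FALL" and an empty "PRESENT" ready for 2019SPRING.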

  3. A way to swap out the data container that stores historical data

As @kshitij10496 aptly put it in the demo day call, Heroku's free plans have restrictions. So, we need to be flexible about where we store the historical data. A crude solution is to store it in a JSON file. Another solution is to run a MongoDB, Elasticsearch or Postgres instance on the metakgp main server.

Whatever the case, the data layer should be easy to swap out for any other backend.
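One way to keep the data layer swappable is to hide it behind a small interface. The Python sketch below is an assumption about how that could look, not the project's actual design: `HistoryStore`, `MemoryStore` and `JSONFileStore` are hypothetical names, and the JSON file variant corresponds to the "crude solution" mentioned above.

```python
import json
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, Optional


class HistoryStore(ABC):
    """Minimal interface that every storage backend would implement."""

    @abstractmethod
    def get(self, semester: str, resource: str, key: str) -> Optional[Any]:
        ...

    @abstractmethod
    def put(self, semester: str, resource: str, key: str, value: Any) -> None:
        ...


class MemoryStore(HistoryStore):
    """Keeps everything in a nested dict; handy for tests."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, Dict[str, Any]]] = {}

    def get(self, semester: str, resource: str, key: str) -> Optional[Any]:
        return self._data.get(semester, {}).get(resource, {}).get(key)

    def put(self, semester: str, resource: str, key: str, value: Any) -> None:
        self._data.setdefault(semester, {}).setdefault(resource, {})[key] = value


class JSONFileStore(HistoryStore):
    """Stores the same nested structure in a single JSON file."""

    def __init__(self, path: str) -> None:
        self._path = Path(path)

    def _load(self) -> Dict[str, Any]:
        if not self._path.exists():
            return {}
        return json.loads(self._path.read_text(encoding="utf-8"))

    def get(self, semester: str, resource: str, key: str) -> Optional[Any]:
        return self._load().get(semester, {}).get(resource, {}).get(key)

    def put(self, semester: str, resource: str, key: str, value: Any) -> None:
        data = self._load()
        data.setdefault(semester, {}).setdefault(resource, {})[key] = value
        self._path.write_text(json.dumps(data), encoding="utf-8")
```

Code that serves the endpoints would depend only on `HistoryStore`, so moving to MongoDB or Postgres later would mean writing one more subclass rather than touching the handlers.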


When the above three pieces are in place, we can start reliably providing historical data. We will also be able to provide this data quickly when a semester ends (because of (2)). And we will be able to adapt to changing conditions, cheaper services, and new technology (because of (3)).