TheTransitClock / transitime

TheTransitClock real-time transit information system
GNU General Public License v3.0

What's your approach on managing database size (disk usage)? #252

Closed by BodoMinea 2 years ago

BodoMinea commented 2 years ago

Not necessarily an issue, but the two instances I am running use a lot of database space because of historical AVL records.

Is there some mechanism to get transitime to clean its own old data?

Or should I just write my own cron job to delete older data? If the manual approach is the way to go, should I shut down the core first, or does it not really matter? And what would you advise on how much data to keep in the database? Are the AVL records / vehicle states used for actual predictions (so that removing them would decrease accuracy), or are they only used on the fly for calculations and afterwards accessed only when specifically requested through an AVL record request dated in the past?

Thank you very much for your input!

scrudden commented 2 years ago

Hi,

The records in the database are not used for predictions, except for ArrivalsDepartures, which can be used to populate the cache for the Kalman prediction method.

You can configure core not to record any data to the database if you do not wish to keep it.

I would keep avlreports and arrivalsdepartures. The other tables relating to vehicle states/events could be cleaned out. I would keep avlreports because playback mode uses that table, so the system can regenerate all other data from it if required.

Hope this helps,

Cheers, Sean.

scrudden commented 2 years ago

You can do this type of thing while core is running.
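For anyone looking for a starting point for the cron approach, here is a minimal sketch of a time-based cleanup, not something taken from the project itself. The table and column names in `PRUNABLE_TABLES` are hypothetical placeholders and must be checked against your actual schema, and `prune_old_rows` is just an illustrative helper. It uses `sqlite3` only so that it runs on its own; with PostgreSQL or MySQL the driver and the parameter placeholder style would differ.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical table -> timestamp-column mapping; verify against your schema.
# avlreports and arrivalsdepartures are deliberately left out, per the advice above.
PRUNABLE_TABLES = {"vehiclestates": "avltime", "vehicleevents": "time"}


def prune_old_rows(conn, tables, keep_days, now=None):
    """Delete rows older than keep_days from each listed table.

    Returns a dict mapping table name -> number of rows deleted.
    """
    now = now or datetime.now()
    # Timestamps are compared as "YYYY-MM-DD HH:MM:SS" text in this sketch.
    cutoff = (now - timedelta(days=keep_days)).isoformat(sep=" ")
    deleted = {}
    for table, column in tables.items():
        # Identifiers cannot be bound as query parameters, so validate them.
        if not (table.isidentifier() and column.isidentifier()):
            raise ValueError(f"unsafe identifier: {table}.{column}")
        cur = conn.execute(f"DELETE FROM {table} WHERE {column} < ?", (cutoff,))
        deleted[table] = cur.rowcount
    conn.commit()
    return deleted
```

A cron job could call something like `prune_old_rows(conn, PRUNABLE_TABLES, keep_days=90)` once a day; the 90-day retention is only an example value. On large tables it may be worth deleting in smaller batches to keep transactions short.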

BodoMinea commented 2 years ago

Thank you! That's all I needed to know :)