pitt-crc / lmod_tracking

Database ingestion for Lmod usage logs
https://crc-pages.pitt.edu/lmod_tracking/
GNU General Public License v3.0

Add ability to fold/compress old data #17

Closed by djperrefort 1 year ago

djperrefort commented 1 year ago

Request from @pitt-crc/pitt-it:

The current application allows the database to grow without limit, which is undesirable from a system administration point of view. It would be preferable to include some way to reduce disk usage, such as deleting, archiving, or compressing older data.

Possible implementation (not a requirement, just a top-of-mind example):

lmod-ingest archive --before [date]

might be used to drop all usage information prior to the given date and instead store the equivalent total number of loads per user, module, and machine.
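To make the "folding" idea concrete, here is a rough sketch of what such a command could do internally. It is not part of lmod-ingest. The `usage_log` and `usage_totals` tables, their columns, and the `fold_usage` helper are all hypothetical, and SQLite (3.24 or newer, for the upsert syntax) stands in for whatever database the application actually uses. Dates are assumed to be stored as ISO 8601 strings so that they compare correctly as text.

```python
import sqlite3

# Hypothetical schema: one row per module load, plus a table of folded totals.
SCHEMA = """
CREATE TABLE usage_log (username TEXT, module TEXT, host TEXT, load_time TEXT);
CREATE TABLE usage_totals (
    username TEXT, module TEXT, host TEXT, loads INTEGER NOT NULL,
    PRIMARY KEY (username, module, host)
);
"""


def fold_usage(conn, before):
    """Replace rows older than `before` with per-(user, module, host) totals."""
    with conn:  # single transaction: aggregate and delete together, or neither
        conn.execute(
            """
            INSERT INTO usage_totals (username, module, host, loads)
            SELECT username, module, host, COUNT(*)
            FROM usage_log
            WHERE load_time < ?
            GROUP BY username, module, host
            ON CONFLICT (username, module, host)
            DO UPDATE SET loads = loads + excluded.loads
            """,
            (before,),
        )
        conn.execute("DELETE FROM usage_log WHERE load_time < ?", (before,))


# Small demonstration with made-up data.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO usage_log VALUES (?, ?, ?, ?)",
    [
        ("alice", "gcc/12", "node1", "2022-06-01"),
        ("alice", "gcc/12", "node1", "2022-07-15"),
        ("bob", "python/3.11", "node2", "2022-12-31"),
        ("alice", "gcc/12", "node1", "2023-02-01"),
    ],
)
conn.commit()

fold_usage(conn, "2023-01-01")
```

Because the totals are merged with an upsert, running the fold again with a later date adds to the existing counts instead of overwriting them. The trade-off is that per-load timestamps before the cutoff are lost for good.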

djperrefort commented 1 year ago

I'm going to close this issue as "will not implement".

It's not clear to me that dropping or "folding" data is the right move. In reality, it will likely depend on the use case. Different teams (or the same team at different points in time) will have different opinions on which subsets of data are useful and how to manage cleanup. Implementing that kind of flexibility is a lot of work for something we will likely not use.