Open hiroyuki-sato opened 6 years ago
This is good because similar system commands or internal workflow might be required for this feature https://github.com/treasure-data/digdag/pull/845. To mitigate this issue, we already have option that we could use Postgresql as Digdag backend instead of h2db. What do you think?
@muga Thank you for replying this issue.
I think It is a good mitigation solution to use PostgreSQL instead of H2. Until introduce a new command, a document about deleting session statement may help for PostgreSQL users. (I think nobody knows how to delete Digdag sessions in PostgreSQL)
@hiroyuki-sato Thank you for suggesting. Got it. sounds reasonable. To make such document, we'd better to investigate 1) what tables we want to reduce 2) what order to reduce size of tables. In my case, sessions, session_attempts, task_archives tables are top 3.
@muga I asked digdag users about database usage in Japanese. Twitter
According to the record I can confirm, the task_archives table is 8 to 9 times larger than the other tables, the next large table is session_attempts (2 times larger than other tables). So, it's enough to make a delete statement for task_archives and session_attempts table.
I ended scheduling this sql to delete records older than 1 mont:
DELETE FROM task_archives WHERE id IN (SELECT id FROM session_attempts WHERE created_at < (NOW() - INTERVAL '1 MONTH'));
DELETE FROM session_attempts WHERE created_at < (NOW() - INTERVAL '1 MONTH');
DELETE FROM sessions WHERE last_attempt_created_at < (NOW() - INTERVAL '1 MONTH');
Hello @muga,
may I suggest a digdag command-line option where you can specify a max age for cleaning up old sessions?
For instance:
# delete sessions older than 30 days
$ digdag sessions --cleanup 30d
This way the cleanup would work regardless of the DB backend and would be suitable for cronjobs.
Digdag server uses large database area when running the server for a long time. In this situation, Digdag user wants to delete past sessions. But currently, those feature doesn't implement yet.
An Idea
digdag delete_sessions
command deletes sessions previous the specific datedelete_sessions>
operator delete sessions previous of the specific date.Ref: Twitter