Open xmkg opened 2 years ago
This is an interesting bug report, and I didn't even realise that the unit db kept this data. I wonder why? I can understand keeping data for the main hooks, but update-status seems like the odd one out. I wonder if the solution is to not (by default) log env for update-status?
> I wonder if the solution is to not (by default) log env for update-status?
That would also work, but the bug would resurface if other hooks were called often enough. I have no idea why this data is being kept, or whether changing the current behavior would break any downstream charms. I tried to locate charms that rely on `gethistory()` by brute-force searching on GitHub and OpenDev, but I failed to find any.
It would be great if anybody familiar with this particular part of the code base could chime in and enlighten us about the rationale and use cases.
The size of `unit-state.db` is growing too large (>= 10 GiB) on some deployments, causing out-of-space issues.

I've prepared a script to generate a summary of the database, which can be found in the attachments. The script summarizes the `kv_revisions` and `hooks` tables and runs `sqlite3_analyzer` over the database.

py-sqlite-analyzer.zip

I ran it over a large DB (13 GiB); the output is as follows:

To summarize the output above: the `kv_revisions` table is taking 99.8% of the space, and the `env` variable revisions account for nearly all the rows in the `kv_revisions` table. In this specific environment, the JSON-serialized list of environment variables is roughly 70 KiB in size, and on each hook invocation this 70 KiB of data is pushed into the `kv_revisions` table. So even if the charm is sitting idle, the `update-status` hook is invoked every 5 minutes, and this alone produces about 7 GiB of data in a year (12 invocations per hour × 24 hours × 365 days × 70 KiB), and this is for a *single* charm. This is obviously bad and causes out-of-disk-space issues for some deployments.

I've reviewed the code that produces the revisions, and it is not blindly pushing all environment variables to the database: it checks whether the value has changed before pushing it as a revision. But in the case of the environment variables, some variables like `JUJU_CONTEXT_ID` change on every hook invocation, so `env` gets pushed as a revision no matter what.

To eliminate this issue, I have several ideas at hand:
1-) Implement a policy to keep last N revisions only
2-) Limit hooks & kv_revisions table max row count to N rows at most
3-) Exclude update-status from hooks & kv_revisions
4-) Rotate the database periodically (e.g. daily/weekly/monthly), compress & store the old ones
5-) Store only the delta for env
6-) Do not keep revisions for env at all?
7-) ... any other ideas?
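
To make idea 1 a bit more concrete, here is a minimal sketch of a "keep the last N revisions per key" pruning pass. It runs against an in-memory stand-in database, not the real `unit-state.db`: the `kv_revisions(key, revision, data)` layout and the helper names `make_demo_db` and `prune_revisions` are assumptions made for illustration and should be checked against the actual schema before trying anything like this on a deployment.

```python
import json
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for unit-state.db (schema is an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE kv_revisions ("
        "key TEXT, revision INTEGER, data TEXT, "
        "PRIMARY KEY (key, revision))"
    )
    # Ten revisions of 'env' that differ only in JUJU_CONTEXT_ID,
    # mimicking one new revision per hook invocation.
    for rev in range(1, 11):
        env = {"JUJU_CONTEXT_ID": "ctx-%d" % rev, "PATH": "/usr/bin"}
        conn.execute(
            "INSERT INTO kv_revisions VALUES (?, ?, ?)",
            ("env", rev, json.dumps(env)),
        )
    # Two revisions of an unrelated (hypothetical) key.
    for rev in (1, 2):
        conn.execute(
            "INSERT INTO kv_revisions VALUES (?, ?, ?)",
            ("config", rev, json.dumps({"debug": rev == 2})),
        )
    conn.commit()
    return conn


def prune_revisions(conn, keep_last):
    """Delete all but the newest `keep_last` revisions of every key.

    Returns the number of rows deleted.
    """
    deleted = 0
    keys = [row[0] for row in conn.execute("SELECT DISTINCT key FROM kv_revisions")]
    for key in keys:
        cur = conn.execute(
            "DELETE FROM kv_revisions WHERE key = ? AND revision NOT IN ("
            "SELECT revision FROM kv_revisions WHERE key = ? "
            "ORDER BY revision DESC LIMIT ?)",
            (key, key, keep_last),
        )
        deleted += cur.rowcount
    conn.commit()
    return deleted


if __name__ == "__main__":
    conn = make_demo_db()
    print("rows deleted:", prune_revisions(conn, keep_last=3))
    print(conn.execute(
        "SELECT key, COUNT(*) FROM kv_revisions GROUP BY key ORDER BY key"
    ).fetchall())
```

Note that deleting rows does not by itself shrink an SQLite file on disk; unless auto-vacuum is enabled, a `VACUUM` would be needed afterwards to actually reclaim the space, so a pruning policy would probably have to be paired with that.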
Thanks in advance.