apache / accumulo

Apache Accumulo
https://accumulo.apache.org
Apache License 2.0

Provide Insight into queued compactions #2561

Open milleruntime opened 2 years ago

milleruntime commented 2 years ago

Is your feature request related to a problem? Please describe. When there are lots of compactions running, they get queued up in the tservers (possibly in the CompactionManager) and can only be seen through the Monitor. The only way an admin knows that compactions are queued is that the Monitor shows the number in parentheses.

Describe the solution you'd like Some way for an admin to see the types of compactions queued and get more information about queued compactions.

Describe alternatives you've considered The Shell provides information about running compactions through the listcompactions command and the fate command. But the fate command only shows user compactions, and listcompactions only shows running compactions.

dlmarion commented 2 years ago

IIRC, the thing that is queued is that a tablet needs compacting. The actual details, which files are going to be compacted, are determined later when it's going to be run. Something to keep in mind is that if external compactions are enabled, then a queued compaction on a tserver may not get run on the tserver.

keith-turner commented 2 years ago

Per-executor metrics for running and queued compactions are emitted for each tserver. I think that is done by this code. The blog post about external compactions plotted these metrics in the ingest section; in that section, look for the compaction queued plot and the description under it. We also have documentation that briefly mentions that these metrics exist, but it does not give any information about how to use them. Would these metrics meet your needs? If you are looking for detailed info about what is queued (like tablets and files to compact, which is subject to change at any time before it runs, like Dave mentioned), then we would possibly need new APIs to expose this information to the shell.

EdColeman commented 2 years ago

+1 for leveraging metrics. If this information is useful (and I think it is), making it externally available via metrics makes monitoring, alerting and possibly trending possible. While it could be useful to have the values available via the shell during troubleshooting and development, operators should not need to use the shell to perform routine monitoring.
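
As a rough illustration of the alerting idea, here is a minimal sketch that assumes the per-executor queued counts have already been scraped from whatever metrics sink is in use into plain (tserver, executor, queued) samples. The sample layout, the executor names, and the threshold are hypothetical and are not Accumulo's actual metric names or API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class QueuedSample(NamedTuple):
    """One scraped reading. Field names are hypothetical, not Accumulo metric names."""
    tserver: str
    executor: str
    queued: int


def queued_by_executor(samples: Iterable[QueuedSample]) -> Dict[str, int]:
    """Sum the queued compaction counts across all tservers for each executor."""
    totals: Dict[str, int] = defaultdict(int)
    for sample in samples:
        totals[sample.executor] += sample.queued
    return dict(totals)


def executors_over_threshold(samples: Iterable[QueuedSample], threshold: int) -> List[str]:
    """Return the executors whose cluster-wide queued total exceeds the threshold."""
    totals = queued_by_executor(samples)
    return sorted(name for name, total in totals.items() if total > threshold)


# Made-up readings for two tservers and two executors.
samples = [
    QueuedSample("tserver1", "small", 4),
    QueuedSample("tserver2", "small", 9),
    QueuedSample("tserver1", "large", 1),
]

print(executors_over_threshold(samples, threshold=10))  # prints ['small']
```

Summing across tservers is only one possible choice; an alert on a single tserver's queue may be more useful when the load is skewed.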

ddanielr commented 3 months ago

Somewhat related to #4782