apache / accumulo

Apache Accumulo
https://accumulo.apache.org
Apache License 2.0

Investigate how to handle examining files as part of compaction planning #3526

Open keith-turner opened 1 year ago

keith-turner commented 1 year ago

When selecting files for a user compaction, the compaction selector plugin could examine summaries and sample data to inform its decision. This supported use cases like the TooManyDeletesSelector, which could trigger a full tablet compaction when the ratio of deletes in a tablet exceeded a certain threshold.
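To make the use case concrete, here is a minimal sketch of a delete-ratio check. It does not use the real Accumulo selector or summary APIs: `DeleteRatioSketch`, `FileSummary`, and `shouldCompactAll` are hypothetical names, and defining the ratio as deletes divided by total entries is an assumption made for illustration.

```java
import java.util.List;

// Hypothetical, simplified model of a delete-ratio check.
// This is NOT the Accumulo CompactionSelector API.
public class DeleteRatioSketch {

  // Hypothetical per-file summary: entry counts that a summarizer might report.
  static final class FileSummary {
    final String name;
    final long totalEntries;
    final long deleteEntries;

    FileSummary(String name, long totalEntries, long deleteEntries) {
      this.name = name;
      this.totalEntries = totalEntries;
      this.deleteEntries = deleteEntries;
    }
  }

  // Returns true when the fraction of delete entries across all files
  // exceeds the threshold (assumed definition: deletes / total).
  static boolean shouldCompactAll(List<FileSummary> files, double threshold) {
    long total = 0;
    long deletes = 0;
    for (FileSummary f : files) {
      total += f.totalEntries;
      deletes += f.deleteEntries;
    }
    if (total == 0) {
      return false; // nothing to examine, so nothing to compact
    }
    return (double) deletes / total > threshold;
  }

  public static void main(String[] args) {
    List<FileSummary> files = List.of(
        new FileSummary("A.rf", 1000, 100),
        new FileSummary("B.rf", 500, 400));
    // 500 deletes out of 1500 entries, compared against a 25% threshold
    System.out.println(shouldCompactAll(files, 0.25));
  }
}
```

The check itself is cheap. The resource concern comes from gathering the per-file counts, which means reading summary or sample data from every candidate file.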

When this examination ran in tablet servers, it had much more CPU and memory available than it will have when running in the manager. In #3513, which moved user compactions from the tablet server to the manager, this functionality was not implemented because of concerns about resources.

keith-turner commented 12 months ago

One possible way to solve this is via #3822 and #3559, which would allow the fate operations that do this examination to run in multiple processes.