apache / accumulo

Apache Accumulo
https://accumulo.apache.org
Apache License 2.0

Prefer splitting over compacting #5105

Closed cshannon closed 4 days ago

cshannon commented 4 days ago

If a tablet needs to both split and compact, it is better to split first and compact afterward, so that the compaction work is divided across the resulting tablets.

This closes #5101
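For illustration only, the ordering described above can be sketched as a small filter over the actions a tablet is flagged with. This is not the code in this PR: the class `SplitFirstPolicy`, its `Action` enum, and `actionsToRun` are hypothetical names, and the only assumption is that a tablet can be flagged for splitting and compacting at the same time.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; these names are illustrative and are not Accumulo's API.
final class SplitFirstPolicy {

  // Illustrative subset of the actions a tablet may be flagged with.
  enum Action { NEEDS_SPLITTING, NEEDS_COMPACTING }

  // Returns the actions to act on now. When a split is pending, compaction is
  // dropped for this pass so it can run later against the smaller child tablets.
  static Set<Action> actionsToRun(Set<Action> flagged) {
    EnumSet<Action> result = EnumSet.noneOf(Action.class);
    result.addAll(flagged);
    if (result.contains(Action.NEEDS_SPLITTING)) {
      result.remove(Action.NEEDS_COMPACTING);
    }
    return result;
  }

  public static void main(String[] args) {
    Set<Action> flagged = EnumSet.of(Action.NEEDS_SPLITTING, Action.NEEDS_COMPACTING);
    System.out.println(actionsToRun(flagged));
  }
}
```

A tablet flagged only for compaction passes through unchanged, so compaction is deferred only while a split is actually pending.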

dlmarion commented 4 days ago

I'm curious whether it's possible for a tablet to be in the NEEDS_SPLITTING state but not split for some reason, so that neither splits nor compactions occur.

cshannon commented 4 days ago

I'm curious whether it's possible for a tablet to be in the NEEDS_SPLITTING state but not split for some reason, so that neither splits nor compactions occur.

I'll let @keith-turner comment too, as he would know better, but each time TGW runs it should recompute things. So if a tablet was marked as needing to be split but then something changed, I would think it would remove that marking and the tablet could compact. I'm not sure whether some other issue could prevent splitting. Maybe if the tablet couldn't be reserved, but I'd think that would block compaction too. The only other thing I could think of was splits happening so rapidly that they block compactions, but I asked @keith-turner and he didn't think this was a big concern.

dlmarion commented 3 days ago

It looks like the TabletManagementIterator will always return NEEDS_SPLITTING to the TabletGroupWatcher, and the Manager will only queue up one split task for the Tablet. If something is wrong with Fate, even if there are just not enough threads, this task could sit in the queue for a long time and no compactions will run. Also, with the new PR #5104, bulk imports will be paused. The fatePoolWatcher will resize when the MANAGER_FATE_THREADPOOL_SIZE property is changed. I wonder if it should resize also based on the Fate queue size. @keith-turner ?

keith-turner commented 2 days ago

The fatePoolWatcher will resize when the MANAGER_FATE_THREADPOOL_SIZE property is changed. I wonder if it should resize also based on the Fate queue size. @keith-turner ?

We could try to make the thread pool shrink and grow based on demand. However, its max size should always be configurable IMO. We cannot grow the pool unbounded.
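A minimal sketch of that idea, using only `java.util.concurrent`: size the pool from current demand (queued plus running tasks), but clamp it to a configured maximum. This is not how the fatePoolWatcher works today. `BoundedPoolResizer`, `targetSize`, `resize`, and the `minSize`/`maxSize` parameters are hypothetical names, and `minSize` is assumed to be at least 1.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not Accumulo code.
final class BoundedPoolResizer {

  // One thread per queued or running task, clamped to [minSize, maxSize].
  static int targetSize(int demand, int minSize, int maxSize) {
    return Math.max(minSize, Math.min(maxSize, demand));
  }

  // Resizes the pool toward current demand without ever exceeding maxSize.
  static void resize(ThreadPoolExecutor pool, int minSize, int maxSize) {
    // Both counts are approximate snapshots, which is fine for periodic resizing.
    int demand = pool.getQueue().size() + pool.getActiveCount();
    int target = targetSize(demand, minSize, maxSize);
    // Order matters: the core size may never exceed the maximum size.
    if (target > pool.getMaximumPoolSize()) {
      pool.setMaximumPoolSize(target); // growing: raise the ceiling first
      pool.setCorePoolSize(target);
    } else {
      pool.setCorePoolSize(target); // shrinking: lower the core first
      pool.setMaximumPoolSize(target);
    }
  }

  public static void main(String[] args) {
    ThreadPoolExecutor pool =
        new ThreadPoolExecutor(1, 1, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    resize(pool, 2, 8); // idle pool, so it settles at minSize
    System.out.println(pool.getCorePoolSize());
    pool.shutdown();
  }
}
```

A watcher could call `resize` on a fixed schedule, passing the configured property value as `maxSize`, so that changing the property still bounds the pool.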