Open keith-turner opened 1 month ago
This change may eventually make it possible for tablet servers to ask other tablet servers to load a tablet. When a tablet is unloaded for a migration, the source tablet server could then immediately ask the destination tablet server to load it.
Is your feature request related to a problem? Please describe.
The Manager maintains a set of migrating tablets in memory, and this in-memory set tightly couples the tablet group watcher and balancer code. When there is a problem it would be useful to see the contents of this set. Logging provides periodic information on this, but it's cumbersome.
Describe the solution you'd like
Remove the in-memory set and replace it with a column in the metadata table that is set on the source tablet and whose value is the destination tablet server. Hopefully this will avoid the need for the balancer code to pass this set to the tablet group watcher code, which in turn passes the set to iterators for filtering. The filtering iterators could read the column in the metadata table directly, so this potentially large set would no longer need to be passed around.
This change should also make it easy to see the contents of the set at any time via a metadata table scan.
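To make the idea concrete, here is a minimal sketch in Python that models the metadata table as a plain dictionary. It is not Accumulo code: the column names `migration` and `loc`, the row format, and all function names are assumptions made for illustration only. It shows the two properties described above: the filter decides from the tablet's own row, with no set passed in, and the migrating tablets can be listed with a simple scan.

```python
from typing import Dict, Iterator, Tuple

# Hypothetical column names; the real column is not named in this issue.
MIGRATION_COL = "migration"
LOCATION_COL = "loc"

# Toy stand-in for the metadata table: tablet row -> {column -> value}.
MetadataTable = Dict[str, Dict[str, str]]


def start_migration(table: MetadataTable, tablet: str, destination: str) -> None:
    """Record the destination tablet server in the source tablet's row."""
    table[tablet][MIGRATION_COL] = destination


def finish_migration(table: MetadataTable, tablet: str) -> None:
    """Assign the tablet to its destination and drop the migration column."""
    destination = table[tablet].pop(MIGRATION_COL)
    table[tablet][LOCATION_COL] = destination


def migrating_tablets(table: MetadataTable) -> Iterator[Tuple[str, str]]:
    """Scan the table and yield (tablet, destination) for migrating tablets."""
    for tablet in sorted(table):
        destination = table[tablet].get(MIGRATION_COL)
        if destination is not None:
            yield tablet, destination


def needs_unload(columns: Dict[str, str]) -> bool:
    """Filter decision made from the row alone, with no external set passed in."""
    destination = columns.get(MIGRATION_COL)
    return destination is not None and columns.get(LOCATION_COL) != destination


if __name__ == "__main__":
    table: MetadataTable = {
        "t1;a": {LOCATION_COL: "tserver1"},
        "t1;m": {LOCATION_COL: "tserver2"},
    }
    start_migration(table, "t1;a", "tserver3")
    print(list(migrating_tablets(table)))
```

In this sketch the balancer would only call something like `start_migration`, and the tablet group watcher's filter would only look at the row it is given, so the two no longer share an in-memory structure.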
Describe alternatives you've considered
The in-memory set could be kept, and one manager process could request it from another via RPC in order to support multiple managers. Placing the set in the metadata table has benefits for filtering and observability that the RPC approach would not have. The metadata approach may also be simpler to keep consistent.
Additional context
This change is probably a prerequisite for multiple managers because it removes in-memory dependencies between different functional components in the manager and moves that state into the metadata table. This makes it possible to run those functional components in different processes.