Open npawar opened 3 years ago
Can you add some more specific requirements? Are you looking for a no downtime move?
I think we can leverage https://github.com/apache/pinot/issues/5951 to simplify this logic.
"Can you add some more specific requirements? Are you looking for a no downtime move?"
I think we should have a no-downtime solution for this.
There are two types of deep store access patterns:
So there are four scenarios:
A. Type I -> Type I: To switch to an instance with a bigger disk, bring down one controller instance, rsync the whole disk from the old controller instance to the new one, then update the download URL in the segment metadata to point at the new instance's host.
B. Type I -> Type II: To achieve a no-downtime migration, the Pinot controller needs to handle the deep store migration first, so that all new segments use the new deep store download URIs. Meanwhile, clients can still download segments from the controller with the old download URIs.
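For the Type I -> Type I case, the "update the download URL with the new host" step could look roughly like the sketch below. It only shows the URL rewrite; the function name, host names, and URL layout are made up for illustration, and reading or writing the actual segment metadata is out of scope here.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_download_host(download_url: str, new_netloc: str) -> str:
    """Return download_url with its host[:port] part replaced by new_netloc.

    Scheme, path, query, and fragment are kept unchanged.
    """
    parts = urlsplit(download_url)
    return urlunsplit(parts._replace(netloc=new_netloc))


# Hypothetical example values, not taken from a real cluster.
old_url = "http://old-controller:9000/segments/myTable/myTable_0"
print(rewrite_download_host(old_url, "new-controller:9000"))
```

Only the host part changes, so the rewrite is safe to re-run on a URL that already points at the new host.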
The gap here is that we should allow users to configure multiple deep stores and read/write from them. https://github.com/apache/pinot/issues/7302 is a prerequisite.
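To make the "multiple deep stores" idea a bit more concrete, here is a minimal sketch of reading through more than one store during a migration by dispatching on the URI scheme. This is not Pinot code: `StoreRegistry`, the fetch callables, and the example URIs are all hypothetical, and the two stores are just in-memory dicts.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit

# A fetcher takes a segment URI and returns the segment bytes (hypothetical shape).
Fetcher = Callable[[str], bytes]


class StoreRegistry:
    """Routes a segment URI to the store registered for its scheme."""

    def __init__(self) -> None:
        self._fetchers: Dict[str, Fetcher] = {}

    def register(self, scheme: str, fetcher: Fetcher) -> None:
        self._fetchers[scheme] = fetcher

    def fetch(self, uri: str) -> bytes:
        scheme = urlsplit(uri).scheme
        if scheme not in self._fetchers:
            raise KeyError(f"no store configured for scheme {scheme!r}")
        return self._fetchers[scheme](uri)


# In-memory stand-ins for the old (controller) and new (remote) stores.
old_store = {"http://controller:9000/segments/t/seg_0": b"old-bytes"}
new_store = {"s3://deepstore/t/seg_1": b"new-bytes"}

registry = StoreRegistry()
registry.register("http", old_store.__getitem__)
registry.register("s3", new_store.__getitem__)
```

With both schemes registered, segments that still carry old download URIs and segments written to the new store can be served side by side while the migration is in progress.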
C. Type II -> Type I: Please think twice and don't do it.
D. Type II -> Type II: This is simple. We can achieve this right now by:
I think overall we need to:
Hi @npawar, I would like to use this issue as an opportunity to get started with simple contributions, and I would like some guidance on getting started. I can see that this issue only solves a small part of the larger problem of migrating data stores, and in that spirit I would like to address the use case of migrating from a local store to a new location supported by PinotFS.
I would like to create a new admin command "SyncSegmentDeepStoreLocations" which would identify all segments in the cluster/controller whose location does not match the "controller.data.dir" configuration of the controller, migrate these segments to the correct location, and update the ZK metadata.
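As a rough sketch of the "identify" part only: assuming each segment's download URL has already been read into a plain dict, the segments that need migrating are the ones whose URL does not sit under the configured data dir. The function name, the dict shape, and the example values are assumptions for illustration, not existing Pinot APIs.

```python
from typing import Dict, List


def find_misplaced_segments(download_urls: Dict[str, str], data_dir: str) -> List[str]:
    """Return the names of segments whose download URL is not under data_dir.

    download_urls maps segment name -> download URL (hypothetical input shape).
    """
    prefix = data_dir.rstrip("/") + "/"
    return sorted(
        name for name, url in download_urls.items() if not url.startswith(prefix)
    )


# Hypothetical example values.
urls = {
    "seg_0": "http://controller:9000/segments/t/seg_0",
    "seg_1": "s3://pinot-deepstore/data/t/seg_1",
}
print(find_misplaced_segments(urls, "s3://pinot-deepstore/data"))
```

A plain prefix check like this would also flag segments that live in the right store under a different base path, which is probably the desired behaviour for a "sync" command.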
My initial thought for the workflow for migration is:
The "SyncSegmentDeepStoreLocations" command itself would:
Does this make sense?
A use case for this is, say someone started with the controller's local disk as deep store, and then wants to migrate to using Azure as deep store. If they enable this, the newer segments will use Azure as deep store, but the older segments will still remain on the controller's local disk. The only way to move the old segments is to manually copy them over to Azure, and then modify the segment ZK metadata.
An admin tool for migrating from deep store X to Y would be handy.
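A sketch of what such a tool might do per segment, assuming the order "copy, verify, then update metadata" so that the old location stays valid until the new copy exists. Everything here is hypothetical: the `copy` and `exists` callables stand in for whatever file system layer is used, and the metadata is a plain dict rather than real ZK metadata.

```python
from typing import Callable, Dict


def migrate_segment(
    name: str,
    metadata: Dict[str, str],
    target_dir: str,
    copy: Callable[[str, str], None],
    exists: Callable[[str], bool],
) -> str:
    """Copy one segment to target_dir, then point its metadata at the new URI.

    The metadata entry is only changed after the copy has been verified.
    """
    old_uri = metadata[name]
    new_uri = target_dir.rstrip("/") + "/" + name
    copy(old_uri, new_uri)
    if not exists(new_uri):
        raise RuntimeError(f"copy of {name} to {new_uri} could not be verified")
    metadata[name] = new_uri
    return new_uri


# In-memory stand-in for the storage layer (hypothetical).
blobs = {"file:///data/t/seg_0": b"segment-bytes"}
meta = {"seg_0": "file:///data/t/seg_0"}


def copy_blob(src: str, dst: str) -> None:
    blobs[dst] = blobs[src]


migrate_segment("seg_0", meta, "abfs://store/t", copy_blob, blobs.__contains__)
```

The old copy is deliberately left in place; deleting it would be a separate step once nothing references the old URI.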