HWCloudEngine / storage-gateway

storage gateway to adapt cloud storage to replicate/snapshot/backup to other different cloud storage.

SG Volume API: Enable SG/Disable SG/Attach/Detach #5

Open yinweiishere opened 8 years ago

yinweiishere commented 8 years ago

Background: SG is an additional service on top of the existing cinder volume types (backends provided by various storage providers). It adds cross-cloud DR capability to existing volumes.

Enable volume with SG: When a user creates a volume with a volume type (say lvm or EBS) that is not capable of backup/replicate/snapshot across clouds, the user can enable the SG service on this volume. SG can only be enabled on a volume in 'available' status.

Enabling SG on a specific existing volume means taking that volume under the management of SG. This operation triggers SG to scan the snapshot of the existing volume (optimized with file system) and hook its IO to SG. Snapshot retrieval takes some time, so there is a status transition: available->SG enabling->available.

Creating a volume with the SG service enabled means creating the original volume and hooking its IO to SG (this will be implemented first). The config of the whole SG service defines the destination of replicate/backup. This operation will be covered in creating and attaching.
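The status transition for enabling SG on an existing volume could be sketched as below. This is only an illustration: the names `VolumeStatus`, `Volume`, `enable_sg`, `scan_snapshot` and `hook_io` are hypothetical and not part of cinder or this repo, and returning to 'available' on failure is an assumption rather than something decided in this issue.

```python
from enum import Enum


class VolumeStatus(Enum):
    AVAILABLE = "available"
    SG_ENABLING = "SG enabling"
    IN_USE = "in-use"


class Volume:
    def __init__(self, volume_id, status=VolumeStatus.AVAILABLE):
        self.volume_id = volume_id
        self.status = status
        self.sg_enabled = False


def enable_sg(volume, scan_snapshot, hook_io):
    """Take an existing volume under SG management.

    scan_snapshot and hook_io are hypothetical callables standing in for
    snapshot retrieval and IO hooking.
    """
    if volume.status is not VolumeStatus.AVAILABLE:
        raise ValueError("SG can only be enabled on an 'available' volume")
    volume.status = VolumeStatus.SG_ENABLING
    try:
        scan_snapshot(volume)  # may take a while
        hook_io(volume)
        volume.sg_enabled = True
    finally:
        # Assumption: the volume goes back to 'available' whether or not
        # enabling succeeded.
        volume.status = VolumeStatus.AVAILABLE
    return volume
```

A caller would pass in whatever actually performs the snapshot scan and IO hook; the sketch only fixes the order of the status changes.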

Disable volume with SG: A user can disable the SG service on a specific volume. Only a volume in 'available' status can have SG disabled. To detach an SG-enabled volume, SG first flushes all journals to the original volume, with no further incoming write IO requests. To disable SG on an available volume, SG de-hooks/detaches the volume from SG.
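A minimal sketch of the detach-then-disable ordering, assuming "without more incoming write IO requests" means SG stops accepting writes before flushing. `SGVolume`, `detach_sg_volume` and `disable_sg` are hypothetical names, and the journal is modelled as a plain list.

```python
class SGVolume:
    def __init__(self, journal=None, status="in-use"):
        self.status = status
        self.sg_enabled = True
        self.accepting_writes = True
        self.journal = list(journal or [])  # writes not yet on the original volume
        self.original = []                  # stands in for the original volume


def detach_sg_volume(volume):
    """Stop new writes, then flush every journal entry in order."""
    volume.accepting_writes = False
    while volume.journal:
        volume.original.append(volume.journal.pop(0))
    volume.status = "available"


def disable_sg(volume):
    """De-hook the volume from SG; only allowed on an 'available' volume."""
    if volume.status != "available":
        raise ValueError("SG can only be disabled on an 'available' volume")
    volume.sg_enabled = False
```

With this ordering, disabling SG never happens while journal entries are still pending, because a volume only becomes 'available' after the flush.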

SG control cmd: Enable, Disable

yinweiishere commented 7 years ago

Issue: How to attach original volume to SG client?

Background:

We now have 'enable/disable SG' on the cinder service. This is because we need to control the cinder status during the SG enabling/disabling procedure; for example, detaching a volume must be notified to the SG service if SG is enabled on it. The same applies to attaching and deleting volumes.

In the 'enable SG' procedure, we need to attach the original volume to the SG client (iscsi mode). The problem is that the cinder service cannot complete an 'attach' to a server on its own, which means the SG backend needs to interact with Nova to do the attachment.

The other scenario is HA, where we need to create a server instance (whether it is an SG client or an SG server). The arbitrator module needs to ask the platform to create a new instance for it.

Plus, if we have an arbitrator to manage the volume to SG client/server mapping, it should be the arbitrator that does the 'enable work' (attaching the original volume to the SG client).

Conclusion:

  1. In a word, the arbitrator will communicate with platform APIs to attach the original volume or create servers from images. To understand the different APIs of different platforms, we need to prepare different platform drivers (hybrid cloud broker). Those drivers overlap with hybrid cloud.

  2. Tied with hybrid cloud, we don't need to prepare drivers ourselves. Arbitrators will just visit the cascading layer keystone to get all service categories.

  3. Separated from hybrid cloud, we need to prepare provider-dependent plugins like the hybrid cloud broker does.

Now, LuoBin is working on approach 2.
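To make the driver idea concrete, here is a rough sketch of an arbitrator that talks to the platform only through a driver interface. All names (`PlatformDriver`, `RecordingDriver`, `Arbitrator` and their methods) are hypothetical. With approach 2 the driver would wrap the cascading layer APIs; with approach 3 there would be one provider-dependent plugin per platform.

```python
from abc import ABC, abstractmethod


class PlatformDriver(ABC):
    """What the arbitrator needs from any platform."""

    @abstractmethod
    def attach_volume(self, volume_id, server_id):
        """Attach a volume to a server instance."""

    @abstractmethod
    def create_server(self, image_id):
        """Create a server from an image and return its id."""


class RecordingDriver(PlatformDriver):
    """Stand-in driver that only records calls, for illustration."""

    def __init__(self):
        self.calls = []

    def attach_volume(self, volume_id, server_id):
        self.calls.append(("attach", volume_id, server_id))

    def create_server(self, image_id):
        self.calls.append(("create", image_id))
        return "server-for-" + image_id


class Arbitrator:
    def __init__(self, driver):
        self.driver = driver
        self.volume_to_client = {}  # volume id -> SG client id

    def enable(self, volume_id, sg_client_id):
        # The 'enable work': attach the original volume to the SG client.
        self.driver.attach_volume(volume_id, sg_client_id)
        self.volume_to_client[volume_id] = sg_client_id

    def new_instance(self, image_id):
        # HA scenario: ask the platform for a new SG client/server instance.
        return self.driver.create_server(image_id)
```

The arbitrator itself stays the same under either approach; only the driver passed to it changes.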

yinweiishere commented 7 years ago

Issue: attach volume to SG client during enable SG or during attach?

Attach during enable SG procedure:

Pros: For volumes in available status, we can enable backup/snapshot functionality as well, with no need to attach them to a VM. Consider the agent mode SG client: to support this feature we will provide several public clients (or just deploy one per DR server node) for those available volumes attached to SG. But this causes another problem: when such a volume is later attached, we first need to detach it from the public client and attach it to the actual SG agent.
Another issue is how the SG driver can tell the SG client mode. For the attach volume to VM operation, the SG driver will either: 1. get the iscsi target and attach it to the guest VM; or 2. detach the volume from the public client and attach it to the guest VM. This decision has to be made in the context of the SG attach mode, so we should give the attach API an optional parameter to indicate the SG attach mode.

Cons: Attaching the volume to the SG client changes the volume status to attached, so the volume cannot be attached during the real attach operation.
The solution is to make attach aware that the volume is SG_enabled and thus not update its status to attached. But this solution is limited to hybrid cloud only. Without hybrid cloud, we cannot control the volume status. In that case, there should be another control layer that is aware of the status update issue.
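The two attach paths and the optional mode parameter could look roughly like this. It is a sketch only: `plan_attach`, the `sg_mode` values "iscsi" and "agent", and the step tuples are made-up names for illustration, not an existing cinder or SG API.

```python
def plan_attach(volume_id, guest_vm, sg_mode, public_client=None):
    """Return the ordered steps the SG driver would take for an attach.

    sg_mode is the hypothetical optional attach parameter: "iscsi" or "agent".
    """
    if sg_mode == "iscsi":
        # 1. get the iscsi target and attach it to the guest VM
        return [
            ("get_iscsi_target", volume_id),
            ("attach_target", volume_id, guest_vm),
        ]
    if sg_mode == "agent":
        if public_client is None:
            raise ValueError("agent mode needs the public client holding the volume")
        # 2. detach the volume from the public client, then attach it to the guest VM
        return [
            ("detach", volume_id, public_client),
            ("attach", volume_id, guest_vm),
        ]
    raise ValueError("unknown SG attach mode: %r" % (sg_mode,))
```

Returning a plan instead of performing the calls keeps the mode decision separate from the platform calls, which depend on whether hybrid cloud is used.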

Attach volume to SG client during attach; enable SG only marks a volume attribute:

Pros: Status update seems correct.

Cons: Available volumes with SG enabled cannot be backed up or snapshotted.

yinweiishere commented 7 years ago

Several issues arise if the cinder attach API is made aware of SG status:

  1. For a volume in cinder available but SG enabling status, the SG driver will call the cascading layer (API) nova attach API to attach the volume to the SG client. But at this time, the volume is available for attaching to other servers as well. We have no way to ensure the transaction except by making Nova aware of SG as well.
  2. For a cinder volume already attached to the SG client, the volume status is attached. If we want to attach this SG-enabled volume to servers, we need either to make the cinder attach API aware of it or to provide a separate SG attach API.
  3. Some storage backends provide DR config and DR capability themselves; we should NOT mask them.

So we make the following changes:

  1. SG will be a separate service that shares a DB table with cinder and independently provides enable, disable, attach, detach, snapshot, replicate, and backup APIs.
  2. For issue 1, we won't ensure the transaction; instead, we only roll back from SG enabling to SG disabled if SG detects that the volume has been reserved by others. https://github.com/yinweiishere/pictures/blob/master/storage-gateway/control-flow/SG%20attach%20status.png

LB will update the full status diagram here.
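Until the full diagram is available, the rollback rule for issue 1 can be sketched as follows. This is an assumption-laden illustration: the dict keys `sg_status` and `reserved_by` and the function `finish_enable` are hypothetical, and how SG actually detects a reservation is not specified in this thread.

```python
def finish_enable(volume, sg_client_id):
    """Complete or roll back an in-progress 'SG enabling'.

    volume is a dict with hypothetical keys:
      'sg_status'   -- "enabling", "enabled" or "disabled"
      'reserved_by' -- id of whoever reserved the volume, or None
    Returns True if SG ends up enabled, False if it was rolled back.
    """
    if volume["sg_status"] != "enabling":
        raise ValueError("volume is not in SG enabling status")
    if volume["reserved_by"] not in (None, sg_client_id):
        # Someone else reserved the volume while SG was enabling:
        # no transaction is enforced, SG just rolls back.
        volume["sg_status"] = "disabled"
        return False
    volume["sg_status"] = "enabled"
    return True
```

The check is optimistic: nothing prevents the competing reservation, SG only reacts to it after the fact.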

yinweiishere commented 7 years ago

will commit rst this week

yinweiishere commented 7 years ago

Implementing API service/client.