kubevirt / containerized-data-importer

Data Import Service for kubernetes, designed with kubevirt in mind.
Apache License 2.0
413 stars 259 forks source link

Efficient cross namespace PVC cloning #1467

Closed salanki closed 2 years ago

salanki commented 3 years ago

/kind enhancement

Abstract: Current PVC cloning between namespaces copies the data between the PVCs. This is both slow, and a waste of resources. The current in-namespace cloning depends on Volume Snapshots. The snapshot based clone can be replaced by CSI Volume Cloning which will lead to faster clones requiring less ceremony.

The CSI Volume Cloning can also be leveraged to do instant, zero-copy cross-namespace volume cloning.

Implementation: The cross-namespace cloning can be achieved in a semi hacky way using the following process:

Assume the cloned PVC is A and the target is a new PVC named C in a different namespacae.

To ensure security, apply the same permission checks as for the current copy based cross-namespace volume cloning. This method requires storageClass to be the same. If they differ, fall back to the copy method.

mhenriks commented 3 years ago

Our intention has been to leverage a community implementation of PVC namespace transfer to implement this feature. But there has not been a lot of traction (partially my fault for not pushing hard enough). Here is the latest KEP: https://github.com/kubernetes/enhancements/pull/2110

ghost commented 3 years ago

Isn't it possible to simply delete the PVC in the source namespace, set the claimRef on the PV to the future PVC in the destination namespace and then create the new PVC ?

There might be a race condition during deletion of the PVC and setting the claimRef, but there may be an annotation that can be set before deleting the PVC and that would make the PV unavailable for other PVCs.

ghost commented 3 years ago

Nevermind my comment. I should have read the referenced KEP. What is the plan to get the KEP accepted?

salanki commented 3 years ago

Isn't it possible to simply delete the PVC in the source namespace, set the claimRef on the PV to the future PVC in the destination namespace and then create the new PVC ?

This is pretty much what we do in #1552.

ghost commented 3 years ago

It's not that obvious from the doc in the PR. But, I probably lack some terminology to fully grasp the content :)

brybacki commented 3 years ago

As discussed on the PR #1552, there were some bits missing to approve and merge it. The idea and the implementation was actually good, but it conflicted with some more generic requirements.

As of now, the "Cluster scoped DataVolume/PVC namespace transfer API" #1673 is merged, and also the CEPH in the CI was updated (https://github.com/kubevirt/kubevirtci/pull/562), so we can go back to the 1552 and implement the efficient clone.

Part of the original PR is already implemented in a different way (ns transfer), and there are merge conflicts now, so it would require more work and a new clean PR.

@salanki , @YitzyD: I noticed you were really interested in moving things forward, and I suppose you already use this solution in production. We now have the capacity to work on the task and finalize it. Are you still working on some version of #1552? Are you OK if I take the task and adapt the code to use the new #1673?

salanki commented 3 years ago

We are currently using this PR in production successfully. As much as we would like to refactor #1552 to use #1673, we don't have the bandwidth to do it ourselves at this time, so please go ahead and take a crack at it. We are here to answer any questions you might have.

brybacki commented 3 years ago

That is great, I am on it.

brybacki commented 3 years ago

In progress: 1824 This now works with CDI - Size Expansion and Namespace Transfer, PR is not yet finished, now working on some last bugs, cleanup and documentation.

brybacki commented 3 years ago

@salanki , @YitzyD: feel free to look and comment on https://github.com/kubevirt/containerized-data-importer/pull/1824

I have added a full test suite (reusing most of existing clone tests), and I still need to update the documentation.

The only functional differences I can think of is are:

t3hmrman commented 2 years ago

Is there any documentation/discussion around what the default is set to (seems to be snapshot)? I recently had a VM get stuck waiting for a DV to be cloned via snapshot, and was able to clear it only by finding the PR merge commit, and manually editing my StorageProfile.

For storage I happen to be using OpenEBS ZFS LocalPV which supports snapshots & clones but with a caveat -- they're same node and same namespace.

It would be great if there was a way to set the default auto-created strategy that CDI will create (so I can choose copy and take the performance hit but have things work) -- at present from what I can tell I must edit the auto-created StorageProfiles.

brybacki commented 2 years ago

The doc about storageprofile[1] (also linked from csi-cloning[2]) states:

The value for cloneStrategy ca be one of: copy,snapshot,csi-clone. When the value is not specified the CDI will try to use the snapshot if possible otherwise it falls back to copy. If the storage class (and its provider) is capable of doing CSI Volume Clone then the user may choose csi-clone as a preferred clone method.

We are open for suggestions if you think it might be more verbose about that, or organize cloning description better.

What you propose the default auto-created strategy that CDI will create also sounds interesting. We need to think about that.

  1. https://github.com/kubevirt/containerized-data-importer/blob/main/doc/storageprofile.md
  2. https://github.com/kubevirt/containerized-data-importer/blob/main/doc/csi-cloning.md
t3hmrman commented 2 years ago

We are open for suggestions if you think it might be more verbose about that, or organize cloning description better.

Hey @brybacki thanks for the pointer, that's plenty clear -- I didn't read that section closely enough.

What you propose the default auto-created strategy that CDI will create also sounds interesting. We need to think about that.

The documentation was also pretty clear about being able to change them after the fact, and that is a solid work around (most may not have this particular problem), but I'd love it if there was a way to pre-specify the strategies that are put into the StorageProfiles, thanks for considering it.

brybacki commented 2 years ago

@t3hmrman There is a PR in progress that might address some of your problems: https://github.com/kubevirt/containerized-data-importer/pull/2004

t3hmrman commented 2 years ago

Thanks for the heads up @brybacki , that looks like it will help for the cross namespace issue!

kubevirt-bot commented 2 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot commented 2 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

kubevirt-bot commented 2 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close

kubevirt-bot commented 2 years ago

@kubevirt-bot: Closing this issue.

In response to [this](https://github.com/kubevirt/containerized-data-importer/issues/1467#issuecomment-1092800012): >Rotten issues close after 30d of inactivity. >Reopen the issue with `/reopen`. >Mark the issue as fresh with `/remove-lifecycle rotten`. > >/close Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.