eclipse-che / che

Kubernetes based Cloud Development Environments for Enterprise Teams
http://eclipse.org/che
Eclipse Public License 2.0

Asynchronously attach PVs to Workspaces #15384

Closed l0rd closed 2 years ago

l0rd commented 4 years ago

Is your enhancement related to a problem?

No matter how fast we get at bootstrapping a Che workspace, and no matter how many external resources we are able to pre-pull (images, extensions, example source code), we will always need to wait 20+ seconds for a PV to be attached and mounted on the workspace pod.

(screenshot: workspace startup timings, showing the time spent attaching and mounting the PV)

Describe the solution you'd like

(diagram: proposed architecture with a data sync pod owning the PV and workspace pods running on ephemeral volumes)

New Workspace lifecycle (a manifest sketch of the two pods follows the list):

  1. Workspace startup phase
    • Pods (workspace and data sync) are started in parallel
  2. Startup data sync phase:
    • Data flow goes from the persistent volume (rsync server) to the ephemeral volume (rsync client)
    • Containers in the Workspace Pod are started but are not allowed to write in the ephemeral volume
  3. Normal workspace usage phase
    • Containers in the Workspace Pod have full R/W access to the ephemeral volume
    • Data flow goes from rsync client to rsync server
  4. Workspace Shutdown phase
    • Containers in the Workspace Pod are stopped and data are flushed to the ephemeral volume
  5. Shutdown data sync phase
    • Data are transferred from the ephemeral volume to the persistent volume
    • Pods are destroyed
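
A minimal sketch of what the two pods described above could look like. All names, images, hostnames, and paths here are illustrative assumptions (it also assumes a Service exposing the data sync pod as `data-sync`), not the actual implementation:

```yaml
# Data sync pod: the only pod that mounts the PV, exposed over rsync
apiVersion: v1
kind: Pod
metadata:
  name: data-sync
  labels:
    app: data-sync
spec:
  containers:
    - name: rsync-server
      image: example/rsync-server:latest      # hypothetical image
      ports:
        - containerPort: 873                  # default rsyncd port
      volumeMounts:
        - name: workspace-data
          mountPath: /data
  volumes:
    - name: workspace-data
      persistentVolumeClaim:
        claimName: workspace-data             # the slow PV attach/mount happens here only
---
# Workspace pod: starts in parallel on an ephemeral volume, no PV attach to wait for
apiVersion: v1
kind: Pod
metadata:
  name: workspace
spec:
  containers:
    - name: sync-client                       # covers phases 2, 3 and 5 of the lifecycle
      image: example/rsync-client:latest      # hypothetical image
      command:
        - sh
        - -c
        - |
          rsync -az rsync://data-sync:873/data/ /projects/   # startup sync: PV -> ephemeral
          touch /projects/.sync-complete                     # signal that read-only mode can end
          sleep infinity                                     # later: periodic sync back to the PV
      volumeMounts:
        - name: projects
          mountPath: /projects
    - name: theia-ide
      image: example/theia:latest             # hypothetical image
      volumeMounts:
        - name: projects
          mountPath: /projects
  volumes:
    - name: projects
      emptyDir: {}                            # ephemeral volume, available immediately
```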

Workspace components in Read-only mode

In the "Startup data sync phase" the user will already be able to use the editor and plugins but those should behave in a read-only mode until all the data has been synced to the ephemeral volume. That means that Che editors (for example theia) should be able to work on read only mode (initially this can be done by showing a progress bar that shows the data sync and not allowing the user to access theia).

rsync protocol

Rsync is mentioned as the remote file synchronization protocol, but that's just an example. If there is a better alternative, let's use it.
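
For reference, a hedged sketch of the client-to-server direction (phases 3 and 5 above), expressed as a replacement for the `sleep infinity` placeholder in the sync-client container sketched earlier; flags, interval, and paths are illustrative:

```yaml
      command:
        - sh
        - -c
        - |
          rsync -az rsync://data-sync:873/data/ /projects/    # startup sync: PV -> ephemeral
          touch /projects/.sync-complete
          while true; do                                      # normal usage: ephemeral -> PV
            rsync -az --delete /projects/ rsync://data-sync:873/data/
            sleep 30
          done
```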

Ideas to improve performance (even more)

Florent's edit:

Tasks

amisevsk commented 4 years ago

This would also help solve issues with e.g. Gluster being too slow for some operations (npm install).

l0rd commented 4 years ago

Another option mentioned by @gorkem is to leverage ephemeral containers, introduced in Kubernetes 1.16. That would allow us to avoid using rsync.
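
For context, a rough sketch of what the alpha API (Kubernetes 1.16) looks like; ephemeral containers are added through the pod's `ephemeralcontainers` subresource, and whether they can actually replace the sync mechanism is exactly the open question here. Names and images are hypothetical:

```yaml
# EphemeralContainers object applied to an existing workspace pod (alpha in 1.16)
apiVersion: v1
kind: EphemeralContainers
metadata:
  name: workspace                        # must match the target pod name
ephemeralContainers:
  - name: sync-helper                    # hypothetical helper container
    image: example/rsync-client:latest   # hypothetical image
    command: ["sh"]
    stdin: true
    tty: true
```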

benoitf commented 4 years ago

Hello,

here are some notes:

about data sync pod:

optimization

One of the goals is to be able to start the workspace as fast as possible. The simplest case for that is when we create a new workspace (no previous state):

If there is previous data, the IDE needs to wait for the project to be restored before displaying the full layout.

Storage synchronization:

optimization: it could clean up the 'unpacked' folder and keep only the zip files when the files have not been used for a long time.

Theia enhancements:

Another optimization: for now, the import/clone of the source code is performed once we enter the IDE (which is useful when a 'private' repository is accessed, as we may need the GitHub token, OAuth, etc.). But in the case of a public repository, if the project is cloned as soon as possible, we could enter Theia with the project already cloned, or have it cloned in parallel. So it might speed up the process again. --> needs another Epic just for this specific item.
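
A hedged sketch of that idea for public repositories: clone in an init container (or in parallel from a sidecar) so the sources already sit in the ephemeral volume when Theia opens. The repository URL, image, and paths are placeholders:

```yaml
  # Added to the workspace pod sketch: clone a public project before the IDE starts
  initContainers:
    - name: clone-project
      image: alpine/git:latest            # hypothetical image choice; entrypoint is `git`
      args: ["clone", "--depth=1", "https://github.com/eclipse/che.git", "/projects/che"]
      volumeMounts:
        - name: projects
          mountPath: /projects
```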

tsmaeder commented 4 years ago

Just a couple of notes:

  1. From the kube docs:

    Warning: Ephemeral containers are in early alpha state and are not suitable for production clusters.

  2. My .m2 folder is 750MB. I suspect we might have to plan for large amounts of data.
  3. In order to prevent data loss, we'll have to rsync while developing. Have we measured the impact this has on the performance of development tools (running yarn on Theia, for example)?

benoitf commented 4 years ago

about 1. we read the docs as well but thx :-)

  1. yes, it is easy to unpack/transfer big chunks, while transferring a lot of small chunks is slow (but that's an obvious fact). BTW some docker images are bigger than 750MB (even compressed).
  2. you may use different sync processes. A dumb one would rsync all files with the same priority (user files and generated files), while a smarter one would give higher priority to files the user really modified (source code) vs the 'node_modules' folder, where even data loss is less problematic. We could also skip all node_modules folders during 'workspace development' and only persist them when closing the workspace (the user may accept to lose that data, but like every ignore rule it's a tradeoff); see the sketch below.
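
A hedged sketch of that kind of prioritized sync, expressed as the push side of the sync-client container from the earlier sketch; the exclude patterns and the shutdown-only pass are illustrative:

```yaml
      # During "workspace development": push user files, skip heavy generated folders
      command:
        - sh
        - -c
        - |
          rsync -az --exclude 'node_modules/' --exclude 'target/' \
                /projects/ rsync://data-sync:873/data/
          # On workspace shutdown only: persist everything, generated folders included
          # rsync -az /projects/ rsync://data-sync:873/data/
```
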
tsmaeder commented 4 years ago

about 1. we read the docs as well but thx :-)

No doubt, but the fact that we're relying on unproven technology might merit a bit of discussion, no? Is this tech ready for our customers?

benoitf commented 4 years ago

@tsmaeder we're not considering it for now, as the solution should work on all OpenShift/Kubernetes instances

gazarenkov commented 4 years ago

Do we really need to have per-workspace PV attachment?

Couldn't we have a single Data Sync service/deployment used by all the workspaces instead?

benoitf commented 4 years ago

@gazarenkov it's per user namespace (all workspaces of a user) first.

gazarenkov commented 4 years ago

@benoitf Ok, thanks, then do we really need it per-namespace? :)

benoitf commented 4 years ago

@gazarenkov at first because, for example, on che.openshift.io you won't be able to mount a single "super big" PV to store all workspaces' data (and then how do you manage per-user quotas as we do today, cross-cluster setups, etc.). By using a per-namespace PV at first (while still considering one service for all users later), it stays within the same K8S architecture.
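
For reference, a hedged sketch of how per-user storage quotas stay manageable with per-namespace PVCs; the namespace name and numbers are placeholders:

```yaml
# Per-user storage quota, applied in each user's namespace (illustrative values)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: workspace-storage
  namespace: user-a-che                 # hypothetical per-user namespace
spec:
  hard:
    persistentvolumeclaims: "1"         # one data sync PVC per user
    requests.storage: 10Gi              # total PV storage the user may claim
```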

gazarenkov commented 4 years ago

@benoitf What are the limitations for mounting a single big PV in che.openshift.io?

l0rd commented 4 years ago

@benoitf What are the limitations for mounting a single big PV in che.openshift.io?

@gazarenkov that's trickier because the service that does the sync needs to deal with files of different users. That can be implemented as a second iteration though. But let's keep this first iteration simple and implement a PV per user.

gazarenkov commented 4 years ago

@l0rd Looks like we are on the same page regarding direction. If so, I'd suggest reconsidering the strategy and evaluating a move to a single service at once, because:

So, I'd definitely suggest evaluating a single data service as an option before we go to implementation.

davidfestal commented 4 years ago

Just some thoughts about this issue, the ongoing work on the Workspace CRD, and cloud shell.

According to this EPIC https://github.com/eclipse/che/issues/15425, there will be, at some point, the ability to start Che 7 workspaces in a lightweight, standalone, and embeddable way, without requiring the presence of the Che master (already demoed as a POC).

One important point mentioned in this EPIC is the big scalability gain that would be brought, in this envisioned K8S-native architecture, by:

In the light of this, I would prefer starting this work with the option that is, as much as possible, compatible with both use-cases:

So it seems to plead for a per-user-namespace solution first. Of course this should not prevent us from extending this solution to use a central service in a second step. But requiring an additional central server to be able to start workspaces seems contrary to the architectural direction we've taken with the DevWorkspace CRD and the cloud-shell.

gazarenkov commented 4 years ago

@davidfestal Could you please elaborate on your vision of the layer that persists project code between (i.e. temporary) user sessions, in light of workspace management decentralization? I.e. if in our next system we replace the Che server with a CRD/controller and Postgres with etcd, what does that mean for the physical storage of projects? How exactly is it related? We are going to replace the single (distributed) filesystem (based on Gluster/Ceph/EBS/something else) with what?

davidfestal commented 4 years ago

@gazarenkov

Physical storage for workspace data is already per-user (if not per-workspace), through namespaced PVs, and not centralized and common to all the users. I don't see what should change here with the Workspace CRD architecture: workspace data storage is already decentralized, and I don't see why it would be required to change the existing way and now store workspace data in a PV common to all users.

But even without going into all technical details here, my point was to say that requiring an additional centralized service in an architecture that finally should be compatible with workspace management decentralization, seems strange to me.

Afaict, the initial proposal from @benoitf with per-user-namespace storage, would fit the existing and future structure of the Workspace CRD POC.

But sure, a centralized workspace storage service could, at some point, be an optimization option for some use-cases.

gorkem commented 4 years ago

Wouldn't a single big PV require ReadWriteMany access mode?

gazarenkov commented 4 years ago

@gorkem I would guess RWO will work fine for a single Data Store Pod; if a second (or more) pod spins up, it depends on whether the scheduler puts it on the same node (should work) or a different one (will not). https://github.com/kubernetes/kubernetes/issues/26567
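
For reference, a hedged sketch of the access modes being discussed; claim names and sizes are placeholders. RWO is enough as long as a single Data Store pod (or pods co-scheduled on one node) mounts the claim, while spreading pods across nodes would need RWX, which not all provisioners support:

```yaml
# Sufficient for a single data-store pod, or several pods on the same node
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: workspace-data
spec:
  accessModes: ["ReadWriteOnce"]     # RWO: mounted by one node at a time
  resources:
    requests:
      storage: 10Gi
---
# Needed if data-store pods may land on different nodes
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: workspace-data-shared
spec:
  accessModes: ["ReadWriteMany"]     # RWX: mounted by multiple nodes simultaneously
  resources:
    requests:
      storage: 10Gi
```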

l0rd commented 4 years ago

@gazarenkov why do you think one central service is simpler? In a centralized service we have to build a secured-to-the-bone mechanism that matches users with folders. And we need to consider scalability as well. A problem with that service and users won't be able to access their data, or even worse, will have access to the data of other users. I don't want to deal with those problems right now.

For the reuse of existing code that's an implementation detail. I would let the team that will work on the code to decide.

gazarenkov commented 4 years ago

@l0rd my guess is that a single service may turn out to be simpler, based on the fact that we have experience with a working system that used this approach. The only potential problem is if we run into PV/K8s infra-specific limitations that will not allow us to use it (such as PV size, access mode, etc.).

I do not think the user should have direct access to this data (which is a hot backup of projects), only via the Data Sync service, which presumably can scale its Pods the same way as a usual K8s Deployment?

I think it may even work without this service, exactly the same as it does with ephemeral storage now, i.e. the user has access to the instance storage only, and syncing this data is an exclusively internal mechanism. That's why I do not think this storage should even know who the owner of a particular workspace is; it may just deal with folders identified by workspaceId.

An additional bonus of this approach may be zero PV attach/mount time (like ephemeral storage again).

So, to me, it looks like an option to consider before coding, no?

gorkem commented 4 years ago

About the Central Service

Some other considerations

l0rd commented 3 years ago

@ibuziuk was there something left here, or can we close the epic?

che-bot commented 2 years ago

Issues go stale after 180 days of inactivity. lifecycle/stale issues rot after an additional 7 days of inactivity and eventually close.

Mark the issue as fresh with /remove-lifecycle stale in a new comment.

If this issue is safe to close now please do so.

Moderators: Add lifecycle/frozen label to avoid stale mode.