benoitf / custom-repo


a #1

Open benoitf opened 4 years ago

benoitf commented 4 years ago

Is your enhancement related to a problem?

No matter how fast we bootstrap a Che workspace, and no matter how many external resources we are able to pre-pull (images, extensions, example source code), we will always need to wait 20+ seconds for a PV to be attached and mounted on the workspace pod.


Describe the solution you'd like


New Workspace lifecycle:

  1. Workspace startup phase
    • Pods (workspace and data sync) are started in parallel
  2. Startup data sync phase
    • Data flow goes from the persistent volume (rsync server) to the ephemeral volume (rsync client)
    • Containers in the Workspace Pod are started but are not allowed to write in the ephemeral volume
  3. Normal workspace usage phase
    • Containers in the Workspace Pod have full R/W access to the ephemeral volume
    • Data flow goes from rsync client to rsync server
  4. Workspace shutdown phase
    • Containers in the Workspace Pod are stopped and data are flushed to the ephemeral volume
  5. Shutdown data sync phase
    • Data are transferred from the ephemeral volume to the persistent volume
    • Pods are destroyed
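
The lifecycle above can be read as a small state machine. The following is only a sketch to make the phases concrete, not an implementation proposal: the names `Phase`, `advance`, `can_write` and `sync_direction` are hypothetical, and it assumes that in the normal usage phase the containers write to the ephemeral volume, which is what the data flow from rsync client to rsync server implies.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    STARTUP = 1        # workspace and data sync pods start in parallel
    STARTUP_SYNC = 2   # persistent -> ephemeral, workspace containers may not write
    NORMAL = 3         # containers have full R/W, ephemeral -> persistent
    SHUTDOWN = 4       # containers stop, data is flushed to the ephemeral volume
    SHUTDOWN_SYNC = 5  # final ephemeral -> persistent transfer, then pods are destroyed


# Each phase can only move forward to the next one.
NEXT_PHASE = {
    Phase.STARTUP: Phase.STARTUP_SYNC,
    Phase.STARTUP_SYNC: Phase.NORMAL,
    Phase.NORMAL: Phase.SHUTDOWN,
    Phase.SHUTDOWN: Phase.SHUTDOWN_SYNC,
}


def advance(phase: Phase) -> Phase:
    """Return the phase that follows `phase`, or raise if it is the last one."""
    if phase not in NEXT_PHASE:
        raise ValueError(f"{phase.name} is the final phase")
    return NEXT_PHASE[phase]


def can_write(phase: Phase) -> bool:
    """Workspace containers may write to the ephemeral volume only in normal usage."""
    return phase is Phase.NORMAL


def sync_direction(phase: Phase) -> Optional[str]:
    """Direction of the data flow between the two volumes, if any."""
    if phase is Phase.STARTUP_SYNC:
        return "persistent->ephemeral"
    if phase in (Phase.NORMAL, Phase.SHUTDOWN_SYNC):
        return "ephemeral->persistent"
    return None
```

For example, `can_write(advance(Phase.STARTUP))` is `False`, which matches the read-only behaviour expected during the startup data sync phase.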

Workspace components in Read-only mode

In the "Startup data sync phase" the user will already be able to use the editor and plugins, but those should behave in read-only mode until all the data has been synced to the ephemeral volume. That means that Che editors (for example Theia) should be able to work in read-only mode. Initially this can be done by showing a progress bar for the data sync and not allowing the user to access Theia until it completes.
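
To illustrate the read-only behaviour, here is a minimal sketch of a gate that an editor backend could consult before accepting writes. `SyncGate` and its methods are hypothetical names, and it assumes the data sync side can report how many bytes have been transferred out of a known total.

```python
class SyncGate:
    """Tracks startup sync progress and blocks writes until the sync is complete."""

    def __init__(self, total_bytes: int) -> None:
        self.total_bytes = total_bytes
        self.synced_bytes = 0

    def report(self, transferred: int) -> None:
        """Record newly transferred bytes, never exceeding the total."""
        self.synced_bytes = min(self.total_bytes, self.synced_bytes + transferred)

    @property
    def progress(self) -> float:
        """Fraction of the data already synced, usable for a progress bar."""
        if self.total_bytes == 0:
            return 1.0
        return self.synced_bytes / self.total_bytes

    @property
    def read_only(self) -> bool:
        """True while some data is still missing from the ephemeral volume."""
        return self.synced_bytes < self.total_bytes

    def check_write(self, path: str) -> None:
        """Raise if a write is attempted before the sync has finished."""
        if self.read_only:
            raise PermissionError(f"{path}: workspace is read-only during data sync")
```

The same `progress` value could drive the progress bar mentioned above, and `read_only` could later be used by editors that support a real read-only mode.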

rsync protocol

Rsync is mentioned as the remote file synchronization protocol, but that's just an example. If there is a better alternative, let's use it.
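
If rsync is kept, the two data flows differ only in which side is the source. The sketch below just builds the command line and does not run it. The local path `/projects/`, the host `data-sync` and the module `workspace` are assumptions made for illustration; `--archive` and `--delete` are standard rsync options.

```python
from typing import List


def rsync_argv(
    direction: str,
    local_dir: str = "/projects/",                  # hypothetical ephemeral volume mount
    remote: str = "rsync://data-sync/workspace/",   # hypothetical rsync daemon URL
) -> List[str]:
    """Build an rsync command for 'restore' (startup) or 'backup' (usage/shutdown)."""
    if direction == "restore":
        src, dst = remote, local_dir    # persistent volume -> ephemeral volume
    elif direction == "backup":
        src, dst = local_dir, remote    # ephemeral volume -> persistent volume
    else:
        raise ValueError(f"unknown direction: {direction}")
    # --archive preserves permissions and timestamps; --delete removes files
    # from the destination that no longer exist in the source.
    return ["rsync", "--archive", "--delete", src, dst]
```

The resulting list could be passed to `subprocess.run` by the rsync client side; swapping rsync for another tool would only change this one function.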

Ideas to improve performance (even more)

benoitf commented 4 years ago

Hello,

here are some notes:

about data sync pod:

optimization

One of the goals is to be able to start the workspace as fast as possible. Consider first the case where we create a new workspace (no previous state):

If there is previous data, the IDE needs to wait for the project to be restored before displaying the full layout.

Storage synchronization:

Optimization: it could clean up the 'unpacked' folder and keep only the zip files if the files have not been used for a long time.
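
A rough sketch of that cleanup, under assumptions: the storage directory is taken to contain an `unpacked` folder next to the zip files, "not used" is approximated by the most recent modification time under `unpacked`, and the function names and the `max_idle_seconds` threshold are hypothetical.

```python
import os
import shutil
import time
from typing import Optional


def newest_mtime(root: str) -> float:
    """Most recent modification time of `root` or anything below it."""
    newest = os.stat(root).st_mtime
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            entry = os.path.join(dirpath, name)
            newest = max(newest, os.lstat(entry).st_mtime)
    return newest


def cleanup_unpacked(
    storage_dir: str, max_idle_seconds: float, now: Optional[float] = None
) -> bool:
    """Delete `<storage_dir>/unpacked` if idle for too long; zip files are kept.

    Returns True when the folder was removed.
    """
    unpacked = os.path.join(storage_dir, "unpacked")
    if not os.path.isdir(unpacked):
        return False
    current = time.time() if now is None else now
    if current - newest_mtime(unpacked) < max_idle_seconds:
        return False
    shutil.rmtree(unpacked)
    return True
```

Only the `unpacked` folder is touched, so the zip files stay available for unpacking again the next time the workspace is started.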

Theia enhancements:

Another optimization: for now, the import/clone of the source code is performed when we enter the IDE. That is useful when a private repository is accessed, since we may need the GitHub token, OAuth, etc. For a public repository, however, the project could be cloned as early as possible, so that when we enter Theia the project has already been cloned or is being cloned in parallel. That might speed up the process further. --> This needs another Epic just for this specific item.
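
The idea of cloning public projects in parallel with the IDE startup could look roughly like this. It is only a sketch: `start_ide` and `clone_project` are hypothetical callables standing in for the real steps, and private repositories keep today's order because the clone may need credentials obtained inside the IDE.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def start_workspace(
    start_ide: Callable[[], A],
    clone_project: Callable[[], B],
    is_public: bool,
) -> Tuple[A, B]:
    """Start the IDE and clone the project, in parallel when the repo is public."""
    if not is_public:
        # Private repository: clone only after the IDE is up (token, OAuth, ...).
        ide = start_ide()
        return ide, clone_project()
    # Public repository: no credentials needed, so both steps can overlap.
    with ThreadPoolExecutor(max_workers=2) as pool:
        ide_future = pool.submit(start_ide)
        clone_future = pool.submit(clone_project)
        return ide_future.result(), clone_future.result()
```

With this shape, the time to a usable workspace for a public repository is bounded by the slower of the two steps rather than by their sum.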