eclipse-che / che

Kubernetes based Cloud Development Environments for Enterprise Teams
http://eclipse.org/che
Eclipse Public License 2.0

Build an offline version of the Devfile registry image #14733

Closed l0rd closed 5 years ago

l0rd commented 5 years ago

Is your enhancement related to a problem?

If Che is deployed on a cluster that doesn't have access to GitHub (because it is behind a firewall or offline), the sample stacks won't work. For that reason we need to build a devfile registry image that contains the samples' source code.

Describe the solution you'd like

The projects could be cloned, packaged as zip files, and copied into the image at build time. The devfiles should be edited to reference the zip files instead of the GitHub repositories. That's similar to what is done for the vsix extensions and the plugin registry.
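The devfile edit described above could look roughly like the sketch below. It assumes the devfile has already been parsed into a dictionary, that each project has a `source` with `type` and `location` fields, and that the zips are served under some base URL. The zip naming scheme (`org-repo.zip`) and the function names are hypothetical, chosen only for illustration.

```python
import copy


def zip_name_for(location):
    """Derive a zip file name from a git URL (hypothetical naming scheme)."""
    path = location.rstrip("/")
    if path.endswith(".git"):
        path = path[:-4]
    org, repo = path.split("/")[-2:]
    return f"{org}-{repo}.zip"


def rewrite_project_sources(devfile, base_url):
    """Return a copy of the devfile whose git projects point at cached zips."""
    result = copy.deepcopy(devfile)
    for project in result.get("projects", []):
        source = project.get("source", {})
        if source.get("type") in ("git", "github"):
            zip_name = zip_name_for(source["location"])
            source["type"] = "zip"
            source["location"] = f"{base_url.rstrip('/')}/{zip_name}"
            # Git-only fields no longer apply to a zip source.
            for key in ("branch", "startPoint", "tag", "commitId"):
                source.pop(key, None)
    return result
```

The input devfile is left untouched, so the same parsed devfile can still be used for an online build.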

amisevsk commented 5 years ago

After discussion in standup today, and discussion with @nickboldt prior, I'd like to propose an alternate solution for this issue:

Difficulties with the current solution

The required steps to implementing cached project zip files would be:

  1. Download zips for all projects listed in devfiles (not difficult)
  2. Rewrite all project URLs in a way that makes the cached zips available to workspaces (difficult)
    • This is hard to do. On the plugin registry, caching vsix files is relatively simple as the URL translation logic is stored in the plugin broker; we can specify relative paths and have them resolved at runtime. With the devfile registry, we would have to build these URLs at runtime, since the route to the registry is unknown at build time.
    • This also requires modifying the deploy process in all cases, as the deployer for Che would need to get the registry URL and add it as an env var for the deployment.
  3. Ensure the rest of the flow can handle pulling in a zip, unzipping it, and storing it in the right place as a git project (maybe already supported, I haven't investigated much).
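Step 2 could be sketched roughly as follows, assuming the image ships devfiles with a placeholder where the registry URL belongs and the entrypoint fills it in from an environment variable set by the deployer. The placeholder `{{ DEVFILE_REGISTRY_URL }}` and the variable name `DEVFILE_REGISTRY_URL` are hypothetical names for illustration, not something the registry is known to use.

```python
# Hypothetical marker written into devfiles at image build time.
PLACEHOLDER = "{{ DEVFILE_REGISTRY_URL }}"


def resolve_placeholders(devfile_text, env):
    """Replace the placeholder with the registry's public URL at startup."""
    url = env.get("DEVFILE_REGISTRY_URL")  # hypothetical variable name
    if not url:
        raise RuntimeError("registry URL is not set; cannot rewrite devfiles")
    return devfile_text.replace(PLACEHOLDER, url.rstrip("/"))
```

An entrypoint script would call this with `os.environ` for every devfile before starting the web server, which is why the deployer has to know the route and pass it in.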

Assumptions

Proposal

Instead of building a registry that has project zips cached locally (which Theia then pulls and unzips to start the workspace), the "offline" build of the devfile registry removes sample projects from workspaces. This allows users to quickly start up workspaces with the tooling they expect, and provides a base into which they can add a project that is relevant to them. Once someone is up and running with Che, with their own project-specific devfiles ready, the devfiles in the registry become less and less useful.

@l0rd @nickboldt WDYT?

nickboldt commented 5 years ago

If you remove the projects (which I like as it cleans up a lot of the problems listed above), could you include a README that loads on startup of the workspace telling a user where they might find a sample project, but that they'd have to import it from zip to get around their firewall restrictions?

Note that "an offline devfile registry" won't include any of the runtime images... so there's still an Ops task to fetch the containers and load them into an airgapped customer's internal registry.

But +1 for removing samples, if we can't have the workspace load and then unpack a zip into /projects/ to create an offline sample project. Can't a devfile invoke shell commands on startup? Could you not unzip sample-project.zip -d /projects/sample-project/ ?
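As a rough illustration of the unzip-on-startup idea, here is a minimal sketch using only the Python standard library. The paths are the ones mentioned above; the function name and the "skip if the directory is not empty" rule are assumptions added for the example, so that a workspace restart does not overwrite a user's changes.

```python
import zipfile
from pathlib import Path


def unpack_sample(zip_path, dest_dir):
    """Extract a bundled sample zip into dest_dir unless it already has content.

    Returns True if the archive was extracted, False if it was skipped.
    """
    dest = Path(dest_dir)
    if dest.exists() and any(dest.iterdir()):
        return False  # assume the project was already unpacked earlier
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return True
```

A startup command could then call something like `unpack_sample("sample-project.zip", "/projects/sample-project")`.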

l0rd commented 5 years ago

Samples are important even in AirGap mode. I don't think it makes sense to have stacks with no projects (a workspace without a project is kind of useless). And re-using one of the sample stacks with a different project does not work in general (commands are not available and building/running can fail). I would postpone AirGap support rather than ship a devfile registry with no samples. cc @slemeur

It's worth mentioning that this devfile registry image containing the source code may become the default one: it's faster, it works in offline environments, and we can safely update the che-sample GitHub repositories without worrying about breaking an old version of Che.

@amisevsk if the problem is retrieving the internal hostname of the plugin registry from the entrypoint, why don't you use the HOSTNAME environment variable?

For the support of zip files for projects @akurinnoy has just merged https://github.com/eclipse/che-theia/pull/442 that should solve your problem.

amisevsk commented 5 years ago

I can write the changes no problem, I just don't know how valuable it actually is. I don't think anybody is going to do actual work from the sample projects -- whether a devfile has a project or not, the user will have to add their project and update all commands. I see the sample devfiles as more of a starting point than something that would be used unmodified.

Regarding the sample projects themselves, many of the projects in devfiles currently do not contain a LICENSE file. I'm not sure of the legality of redistributing code with no license attached.

Projects that do not have a license I see:

if the problem is retrieving the internal hostname of the plugin registry from the entrypoint, why don't you use the HOSTNAME environment variable?

The issue isn't the internal hostname; the issue is that the registry has to update the devfiles to point to its public route/ingress at startup. AFAIK the only way to do that would be to set an env var on the deployment, but this would require us to:

  1. Create the route/ingress for the registry.
  2. Get the URL of that public endpoint.
  3. Set the env var on the deployment and create it.

This would require some work on chectl, and would not be supported in raw templates.

amisevsk commented 5 years ago

To add to the above, many sample projects won't be buildable unless we also bundle their dependencies. Even if we can import the spring-boot sample from the devfile registry, at the very least maven will need to be configured to point to your internal maven repository (assuming the dependencies are present there).

l0rd commented 5 years ago

I can write the changes no problem, I just don't know how valuable it actually is. I don't think anybody is going to do actual work from the sample projects -- whether a devfile has a project or not, the user will have to add their project and update all commands. I see the sample devfiles as more of a starting point than something that would be used unmodified.

That's a good point @amisevsk, but we are not including the samples for the offline scenario only. That's supposed to become the default for the online scenario as well, for the reasons that I mentioned in the previous comment. As for the offline scenario, it is much more comforting for a user to start from a working sample than from a stack that doesn't work when an arbitrary project is added.

Regarding the sample projects themselves, many of the projects in devfiles currently do not contain a LICENSE file. I'm not sure of the legality of redistributing code with no license attached.

That should be addressed as a separate issue. We are already addressing it downstream and I have created a separate issue for upstream: #14790

The issue isn't the internal hostname, the issue is that the registry has to update the devfiles to point to its public route/ingress at startup.

You are right. We need to find a way to get the domain host. That's something that may be available in /etc/hosts, /etc/resolv.conf, env variables, config maps, nslookup... I don't know, I guess we need to be creative here.

To add to the above, many sample projects won't be buildable unless we also bundle their dependencies. Even if we can import the spring-boot sample from the devfile registry, at the very least maven will need to be configured to point to your internal maven repository (assuming the dependencies are present there).

Another good point. As of today I don't see how we could solve this easily. We can only document how to configure a Maven repository mirror (as well as npm, PyPI, etc.). We need to think about whether, in the future, we make those che-server configuration properties, devfile extensions, or something else.
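For the documentation route, a Maven mirror boils down to a small `settings.xml`. The sketch below generates one with the standard library; the mirror URL and the `internal-mirror` id are placeholders, and it assumes a single mirror that covers all repositories (`mirrorOf` set to `*`). A real setup may also need credentials or the settings XML namespace declaration.

```python
import xml.etree.ElementTree as ET


def maven_mirror_settings(mirror_url, mirror_id="internal-mirror"):
    """Build a minimal Maven settings.xml that routes all repositories to one mirror."""
    settings = ET.Element("settings")
    mirrors = ET.SubElement(settings, "mirrors")
    mirror = ET.SubElement(mirrors, "mirror")
    ET.SubElement(mirror, "id").text = mirror_id
    ET.SubElement(mirror, "mirrorOf").text = "*"  # mirror every repository
    ET.SubElement(mirror, "url").text = mirror_url
    return ET.tostring(settings, encoding="unicode")
```

The resulting text would typically be written to `~/.m2/settings.xml` inside the workspace container; npm and pip have analogous registry and index settings.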

benoitf commented 5 years ago

Can we zip the example git repository and host the git/zip as part of the registry as well?

l0rd commented 5 years ago

@benoitf that's the goal yes.

amisevsk commented 5 years ago

Opened PR https://github.com/eclipse/che-devfile-registry/pull/112

Currently targets a branch until https://github.com/eclipse/che-devfile-registry/pull/110 is merged.

amisevsk commented 5 years ago

We need to find a way to get the domain host. That's something that may be available in /etc/hosts, /etc/resolv.conf, env variables, config maps, nslookup

This is a good idea -- I've updated the PR to attempt to resolve the registry IP/port combo from k8s-provisioned env vars if the main override env var is not set.
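The fallback could look something like the sketch below. It is not the code from the PR: the override variable name `DEVFILE_REGISTRY_URL` and the default service name `che-devfile-registry` are assumptions for illustration. What it relies on is the Kubernetes convention of injecting `<SERVICE>_SERVICE_HOST` and `<SERVICE>_SERVICE_PORT` variables (service name upper-cased, dashes turned into underscores) into pods started after the service exists.

```python
def registry_base_url(env, service_name="che-devfile-registry"):
    """Pick the registry URL: explicit override first, then k8s service env vars."""
    override = env.get("DEVFILE_REGISTRY_URL")  # hypothetical override variable
    if override:
        return override.rstrip("/")
    # Kubernetes exposes each service as <NAME>_SERVICE_HOST / <NAME>_SERVICE_PORT.
    prefix = service_name.upper().replace("-", "_")
    host = env.get(f"{prefix}_SERVICE_HOST")
    port = env.get(f"{prefix}_SERVICE_PORT")
    if host and port:
        return f"http://{host}:{port}"
    return None  # caller decides how to handle an unresolved registry URL
```

Note that the service variables give a cluster-internal address, which is reachable from workspace pods but not from a browser outside the cluster.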

ibuziuk commented 5 years ago

@amisevsk can we close this one, taking into account that there is a separate issue for docs?

amisevsk commented 5 years ago

Yes, this issue can be closed (it's possible to build an offline devfile registry, but we're currently not doing it by default).