eclipse-archived / codewind

The official repository of the Eclipse Codewind project
https://codewind.dev
Eclipse Public License 2.0

Implement a Codewind operator #1598

Closed johnmcollier closed 4 years ago

johnmcollier commented 4 years ago

So I've been thinking over the differences in how chectl deploys Che and how cwctl deploys remote Codewind on Kube. The big difference between the two is that chectl is really just a wrapper for the Che operator, rather than dealing with deployment, upgrades and deletion directly. That got me thinking: would we want to write an operator to manage the deployment of remote Codewind? There are a few advantages (in my opinion):

I know @markcor11 has had the same thoughts as I have. What are other people's thoughts? Is this something worth exploring from our end?

hhellyer commented 4 years ago

/assign @markcor11

jgwest commented 4 years ago

@hhellyer: GitHub didn't allow me to assign the following users: markcor11

Note that only org members, repo collaborators and people who have commented on this issue/PR can be assigned.

markcor11 commented 4 years ago

Will look into this

markcor11 commented 4 years ago

/assign @markcor11

johnmcollier commented 4 years ago

Removing the iterative dev label since the Portal squad is driving this.

Also removing the tech-topics label since we no longer need one ;)

markcor11 commented 4 years ago

Added design doc : https://github.com/codewind-resources/design-documentation/blob/master/codewindServer/CodewindOperator.md

markcor11 commented 4 years ago

Work in progress: Codewind, Performance, Gatekeeper and Keycloak are now deployed via an operator:

$ kubectl apply -f deploy/crds/codewind.eclipse.org_v1alpha1_keycloak_cr.yaml
keycloak.codewind.eclipse.org/codewind-keycloak-k3a237fj created

$ kubectl get keycloaks
NAME                         DEPLOYMENT   NAMESPACE   AGE   ACCESS
codewind-keycloak-k3a237fj   devex-0001   codewind    37s   https://codewind-keycloak-k3a237fj.10.100.111.145.nip.io

Installing a Codewind deployment :

$ kubectl apply -f deploy/crds/codewind.eclipse.org_v1alpha1_codewind_cr.yaml  
codewind.codewind.eclipse.org/codewind-k81235kj created
$ kubectl get codewinds
NAME                USERNAME      NAMESPACE   AGE   AUTH         ACCESSURL
codewind-k81235kj   cody-sprint   codewind    53s   devex-0001   https://codewind-gatekeeper-k81235kj.10.100.111.145.nip.io

The operator currently creates the Deployments, Services, Ingress, PVC and Secrets, with owner references (for garbage collection) set to the Codewind custom resource, and it reconciles any changes.
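To illustrate the owner/garbage-collection point, here is a minimal sketch of the owner reference a child resource (for example a Deployment) would carry. It uses only the Go standard library and mirrors the shape of a Kubernetes `metadata.ownerReferences` entry. The `ownerRefFor` helper and the UID value are hypothetical and not taken from the operator's code, which may well build this differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OwnerReference mirrors the fields of one entry in a Kubernetes
// object's metadata.ownerReferences list.
type OwnerReference struct {
	APIVersion         string `json:"apiVersion"`
	Kind               string `json:"kind"`
	Name               string `json:"name"`
	UID                string `json:"uid"`
	Controller         bool   `json:"controller"`
	BlockOwnerDeletion bool   `json:"blockOwnerDeletion"`
}

// ownerRefFor is a hypothetical helper: it builds the reference a child
// resource would carry so that deleting the Codewind custom resource
// lets the Kubernetes garbage collector remove the child as well.
func ownerRefFor(name, uid string) OwnerReference {
	return OwnerReference{
		APIVersion:         "codewind.eclipse.org/v1alpha1",
		Kind:               "Codewind",
		Name:               name,
		UID:                uid,
		Controller:         true,
		BlockOwnerDeletion: true,
	}
}

func main() {
	// The UID below is a placeholder; a real one is assigned by the API server.
	ref := ownerRefFor("codewind-k81235kj", "00000000-0000-0000-0000-000000000000")
	out, err := json.MarshalIndent(ref, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With a reference like this on each child, removing the custom resource is enough to clean up everything the operator created for it.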

Next to do :

configure Keycloak with user and Codewind deployment

markcor11 commented 4 years ago

I now have the operator deploying Keycloak and Codewind. It can :

Example Keycloak YAML :

apiVersion: codewind.eclipse.org/v1alpha1
kind: Keycloak
metadata:
  name: devex001
  namespace: codewind
spec:
  storageSize: 1Gi

Example Codewind YAML :

apiVersion: codewind.eclipse.org/v1alpha1
kind: Codewind
metadata:
  name: jane1
  namespace: codewind
spec:
  keycloakDeployment: devex001
  username: jane
  logLevel: info
  storageSize: 10Gi

Both VSCode and Eclipse are able to connect to an operator-monitored Codewind deployment, and both are able to create projects and run them (including accessing the performance dashboard).

So that's the core functionality done, next is to verify functionality on some alternative platforms:

markcor11 commented 4 years ago

While testing in Red Hat CodeReady Containers I had to make a couple of changes to accommodate OpenShift routes.

  1. Registered the OpenShift Route V1 API with the operator manager scheme.
  2. Fixed an old issue we have in CWCTL that required granting "privileged" security context constraints using the oc adm policy add-scc-to-group privileged command. This manual step is no longer necessary for operator deployments, since I've added this resource directly to the cluster roles, which have role bindings to the service account for the Codewind deployment.
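As a rough illustration of the second change, the sketch below prints the kind of RBAC rule that lets a service account use the `privileged` SCC through a cluster role instead of a manual `oc adm policy` step. It uses only the Go standard library and mirrors the shape of a ClusterRole rule. The `privilegedSCCRule` helper is hypothetical, and the API group, resource and verb shown are the commonly documented OpenShift pattern, assumed here rather than copied from the operator's actual role definition.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PolicyRule mirrors the fields of a single rule in a Kubernetes
// ClusterRole (rbac.authorization.k8s.io/v1).
type PolicyRule struct {
	APIGroups     []string `json:"apiGroups"`
	Resources     []string `json:"resources"`
	ResourceNames []string `json:"resourceNames,omitempty"`
	Verbs         []string `json:"verbs"`
}

// privilegedSCCRule is a hypothetical helper returning a rule that grants
// "use" on the "privileged" security context constraint. Binding a role
// containing this rule to a service account is assumed to replace the
// manual oc adm policy step.
func privilegedSCCRule() PolicyRule {
	return PolicyRule{
		APIGroups:     []string{"security.openshift.io"},
		Resources:     []string{"securitycontextconstraints"},
		ResourceNames: []string{"privileged"},
		Verbs:         []string{"use"},
	}
}

func main() {
	out, err := json.MarshalIndent(privilegedSCCRule(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping the grant in the cluster role means it is applied with the rest of the operator's manifests, so no separate administrator step is needed per deployment.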

As a result of those changes, the operator is now able to deploy to Red Hat CodeReady Containers and to Kubernetes v1.15.5 on Docker.

markcor11 commented 4 years ago

Deploying to an OpenShift 3.11 cluster had a problem because the generated CRDs have a type:object that is not recognised by the OpenShift 3.11 API. That blocks deployment of the Keycloak and Codewind CRD YAMLs; however, it can be worked around by removing that generic type. Doing that lets the CRD deploy OK and also allowed the IDEs to connect, create and run projects fine. I've added a TODO to look into this, but since we primarily want to support the latest version of OpenShift (V4) this should be OK.

markcor11 commented 4 years ago

I set up a new Red Hat OpenShift 3.11 cluster on IBM Public Cloud - no problems here, the correct storage classes were configured by the operator:

Screenshot 2020-03-19 at 00 59 02

Connected, built and launched a project from the IDE - worked fine.

markcor11 commented 4 years ago

On Kubernetes v1.15.10+IKS (IBM Public Cloud), deployment of Codewind was successful. The VSCode plugin was able to connect, authenticate and create a project. Codewind built the image but failed to start any project (Go and Node) 👎

 message: projectStatusChanged
 data: {
  "projectID": "1e1c80a0-6986-11ea-9814-f507a0f8c0fd",
  "appStatus": "stopped",
  "detailedAppStatus": {
    "severity": "ERROR",
    "message": " pod for helm release cw-mynodetest3-1e1c80a0-6986-11ea-9814 failed to start",
    "notify": false,
    "notificationID": ""
  }

johnmcollier commented 4 years ago

@markcor11 that’s expected with IKS with Kube 1.15. The version of containerd it uses is incompatible with buildah. Move up to Kube 1.16 or higher and things are fixed.

See https://github.com/eclipse/codewind/issues/2251

markcor11 commented 4 years ago

Thanks @johnmcollier, I upgraded the cluster to 1.16.8 and was able to connect, build and launch projects OK on IKS.

Screenshot 2020-03-19 at 10 02 15

Left to do - OpenShift 4

markcor11 commented 4 years ago

ROKS 4.3 cluster on IBM Public Cloud - no problems building and running a project from IDE : OpenShift Version 4.3.1, Kubernetes Version v1.16.2.

Screenshot 2020-03-19 at 11 55 15

So that should be everything tested and working, with everything deployed using the same set of steps. We don't have a repo in Eclipse for this yet, but we can work on the docs and write some tests in the meantime.

markcor11 commented 4 years ago

Operator code is now hosted in its own repo @ http://github.com/eclipse/codewind-operator which includes install instructions, an install.sh, custom resource definitions and the operator code itself.

/pipeline verify

codewind-bot commented 4 years ago

@johnmcollier - this issue is now ready to be verified.