redhat-developer / opencompose

OpenCompose - A higher-level abstraction for Kubernetes resources

experimental new `App` resource kind #138

jstrachan opened 7 years ago

jstrachan commented 7 years ago

So I've noodled through the opencompose examples and I love their concise nature. What worries me is leaky abstractions and keeping in sync with all the detail in the kubernetes resources, which keep changing over time. e.g. from a quick look at the issue tracker there are already a fair few gaps: #134, #130, #128, #126, #125, #124, #123, #122, #120, #114, #101, #93, #91, #90, #85, #81, #24

As a thought experiment I had a little play with adding something opencompose-ish to the fabric8-maven-plugin. Here's a link to the issue with some example markup, what it generates, and the design rationale.

Basically I tried adding the smallest possible new resource kind, trying to be nice and concise while letting the real kubernetes structs shine through, so that there's no leaky abstraction and minimal learning is required for kubernetes-savvy folks. It turns out I've a fix for these issues too ;) #134, #130, #128, #126, #125, #124, #123, #122, #120, #114, #101, #93, #91, #90, #85, #81, #24

Basically I:

  • added the properties from DeploymentSpec other than the template (which is mostly just the PodSpec anyway)
  • added a service property for defining a service for the app (which uses ServiceSpec)
  • added an optional configData property for easily adding configuration data to an app
  • added optional extra PVCs (we could do the same with emptyDirs too I guess); it's easy to also mount external volumes/secrets/configmaps with the current App since it's a PodSpec already - the persistentVolumes property is more for defining the actual PVCs that are owned by this App

Example

Here's a sample app.yml file using the App resource:

---
name: "cheese"
labels:
  app: "cheese"
annotations:
  com.acme: "foo"
replicas: 1
containers:
- image: "foo/bar:123"
  env:
  - name: "MY_ENV"
    value: "CHEESE"
  volumeMounts:
  - mountPath: "/app/logs"
    name: "log-data"
  - mountPath: "/app/appstuff"
    name: "app-data"
service:
  ports:
  - port: 80
    targetPort: 8080
configData:
  application.properties: "foo.bar = abcd"
persistentVolumes:
  log-data:
    accessModes:
    - "ReadWriteOnce"
    resources:
      limits:
        storage: "1Gi"
  app-data:
    accessModes:
    - "ReadWriteMany"
    resources:
      limits:
        storage: "5Gi"

Generated these resources:

---
apiVersion: "v1"
kind: "List"
items:
- apiVersion: "v1"
  kind: "Service"
  metadata:
    annotations:
      com.acme: "foo"
    finalizers: []
    labels:
      app: "cheese"
    name: "cheese"
    ownerReferences: []
  spec:
    deprecatedPublicIPs: []
    externalIPs: []
    loadBalancerSourceRanges: []
    ports:
    - port: 80
      targetPort: 8080
    selector:
      app: "cheese"
- apiVersion: "v1"
  kind: "PersistentVolumeClaim"
  metadata:
    annotations:
      com.acme: "foo"
    finalizers: []
    labels:
      app: "cheese"
    name: "app-data"
    ownerReferences: []
  spec:
    accessModes:
    - "ReadWriteMany"
    resources:
      limits:
        storage: "5Gi"
      requests: {}
- apiVersion: "v1"
  kind: "PersistentVolumeClaim"
  metadata:
    annotations:
      com.acme: "foo"
    finalizers: []
    labels:
      app: "cheese"
    name: "log-data"
    ownerReferences: []
  spec:
    accessModes:
    - "ReadWriteOnce"
    resources:
      limits:
        storage: "1Gi"
      requests: {}
- apiVersion: "v1"
  kind: "ConfigMap"
  metadata:
    annotations:
      com.acme: "foo"
    finalizers: []
    labels:
      app: "cheese"
    name: "cheese"
    ownerReferences: []
  data:
    application.properties: "foo.bar = abcd"
- apiVersion: "extensions/v1beta1"
  kind: "Deployment"
  metadata:
    annotations:
      com.acme: "foo"
    finalizers: []
    labels:
      app: "cheese"
    name: "cheese"
    ownerReferences: []
  spec:
    paused: false
    replicas: 1
    selector:
      matchExpressions: []
      matchLabels:
        app: "cheese"
    template:
      metadata:
        annotations:
          com.acme: "foo"
        finalizers: []
        labels:
          app: "cheese"
        name: "cheese"
        ownerReferences: []
      spec:
        containers:
        - args: []
          command: []
          env:
          - name: "MY_ENV"
            value: "CHEESE"
          image: "foo/bar:123"
          ports: []
          volumeMounts:
          - mountPath: "/app/logs"
            name: "log-data"
          - mountPath: "/app/appstuff"
            name: "app-data"
        imagePullSecrets: []
        nodeSelector: {}
        volumes:
        - name: "log-data"
          persistentVolumeClaim:
            claimName: "log-data"
        - name: "app-data"
          persistentVolumeClaim:
            claimName: "app-data"

Comparing App to opencompose

What's interesting is that it's really very similar to opencompose in feel, conciseness and markup. The main difference is the header part; opencompose looks more like docker compose at the top (the version: and services: sections).

We already have kompose for migrating docker compose to kubernetes, so I'd rather opencompose focus on what's best for kubernetes, using its conventions.

It'd be trivial to define an AppList resource type as a collection of Apps and be more like opencompose; though the kubernetes convention for lists of structs is not to use maps like:

services:
  foo:
    cheese: edam

but to use a list:

kind: AppList
items:
- name: foo
  cheese: edam

So if I were reinventing a 2.0 of opencompose I'd probably ditch this docker compose convention anyway.

Other than that things are fairly similar really; the main difference is the focus on reusing the Spec structs from kubernetes and composing them together in a concise way rather than inventing a brand new schema.

Interestingly, on the services: versus AppList front: with the focus on microservices (one app that does one thing well), I think it's going to be easiest for developers to have an app YAML file for each app they work on. Typically the default case will be 1 app.yml per microservice (i.e. per git repo) anyway.

A typical docker compose example often shows a front end and a database, which is great for demos. However in the real world the database is usually owned by one team and provisioned and upgraded on a totally different schedule to the front-end app; in which case having 2 separate app.yml files for them makes loads more sense really.

Plus developers find it hard to grok large nested yaml files anyway ;) so I do actually prefer the default of keeping apps in different files.

Closing thoughts

Note that this was just a thought experiment. I was really just thinking of Java folks and reducing their typing when hacking YAML in a validating/completing IDE. I'm still mulling over how this may or may not affect opencompose.

One interesting further thought: with the new API Groups in kubernetes we can add new schemas to kubernetes along with controllers/operators that work on them. So we could create a new App resource (maybe even upstream it to kubernetes itself - or worst case make an API Group and controller for it) so that folks could use kubectl directly:

$ kubectl get apps
$ kubectl apply -f myapp.yml

etc. Then whenever an App is created/updated/deleted, a controller runs the canonical generator to make the resulting Deployment, Service, ConfigMap, PVCs etc and applies those changes.

i.e. then folks could work at this App abstraction as a simpler way of working with the actual underlying kubernetes abstractions.

If we did go in this direction though, we'd maybe wanna make a minor change to App and move the metadata fields out, rename App to AppSpec, and then add a new App class/struct containing Metadata and AppSpec so that it follows the conventions of all the other kubernetes resources. Then we'd add AppList, and then the kubectl / REST API would be consistent.
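
To make that concrete, here's a minimal sketch of that shape, assuming the usual apimachinery metadata types (the AppSpec fields are just illustrative):

package app

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// AppSpec would carry everything the current App resource carries, minus metadata.
type AppSpec struct {
    Replicas *int32 `json:"replicas,omitempty"`
    // ... containers, service, configData, persistentVolumes, etc.
}

// App follows the standard kubernetes resource shape: TypeMeta + ObjectMeta + Spec.
type App struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec              AppSpec `json:"spec,omitempty"`
}

// AppList makes `kubectl get apps` and the REST API behave like any other resource list.
type AppList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []App `json:"items"`
}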

jstrachan commented 7 years ago

Incidentally the java code is all code generated from the underlying kubernetes source code so these Java classes keep in sync with the kubernetes structs; this is obviously much easier to do in golang as we can then just reuse the kubernetes Spec structs directly ;)

So over time this model should keep pace nicely with the underlying changes in the kubernetes model. The only thing we need to be careful of is any conflicts when merging the DeploymentSpec properties and the PodSpec properties, along with the extensions, into the same struct - which hopefully will be fine.

pradeepto commented 7 years ago

Note: Data Transfer Object == DTO, aka a struct in Go language terms.

jstrachan commented 7 years ago

thanks @pradeepto - I stuck to referring to them as structs in the above to avoid more confusion with other opencomposers

jstrachan commented 7 years ago

Incidentally, another argument against the services: header longer term for opencompose is that in the kubernetes ecosystem services are assumed to be kubernetes Service resources; whereas opencompose is more about making apps / microservices.

kadel commented 7 years ago

Hi @jstrachan It looks interesting and I like the idea of "letting the real kubernetes structs shine through".

With this approach, especially with having 1 app.yaml file for every app (a separate file for the db and the application), how would you handle things that are shared between applications? For example, Secrets that are defined once but can be used by multiple applications.

jstrachan commented 7 years ago

So the first thing is that Secrets should never be included in your application's distributed YAML that gets moved between dev / test / staging / production; they should always be curated on a per-environment basis separately from the app. e.g. in production Secrets will be generated/managed/rotated by operations folks rather than the developer who wrote the microservice.

So today in the app.yml's container: you can expose secrets as environment variables or mount entries as files using the volume mounts in the usual kubernetes way - you just reference the secret by name (without including or generating the Secret resource).

      containers:
      - image: foo
        env:
        - name: MY_PASSWORD
          valueFrom:
            secretKeyRef:
              name: somesecret
              key: user.password

Referring to Volumes/Secrets/ConfigMaps by name describes a static reference that operational tools in each environment can check exists before you try to install an app - while allowing things to be configured separately in each environment.

One thing that's currently missing in kubernetes is a SecretClaim resource; rather like a PVC claims a PV, it would be nice to have a SecretClaim to describe the name and metadata you need (e.g. what kind of key/cert) so that tools can auto-generate secrets or import them from known locations like Vault etc. We've used annotations for this in the past https://github.com/fabric8io/fabric8/blob/master/docs/secretAnnotations.md - one day it'd be nice to separate that out into its own API Group / ThirdPartyResource along with a controller/operator that could automatically generate random secrets (or import secrets from known locations on a per-environment basis).
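
To make the idea concrete, a hypothetical SecretClaimSpec might look something like this - nothing like it exists in kubernetes today, so all the names below are made up:

// SecretClaimSpec is a hypothetical spec: it names the Secret an app needs
// and describes the material required, without embedding any secret data.
type SecretClaimSpec struct {
    // SecretName is the Secret this claim should bind to or have generated.
    SecretName string `json:"secretName"`
    // Type hints at what kind of material is needed, e.g. "password",
    // "ssh-keypair" or "tls-cert", so a controller can generate or import it.
    Type string `json:"type"`
    // Keys the consuming app expects to find in the bound Secret.
    Keys []string `json:"keys,omitempty"`
}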

Though I think the common case for PVCs and ConfigMaps is for them to be included in the app, as the App often owns them (it's the PVs that are never included and are handled by Operations). Though sometimes you may wish to reference a shared ConfigMap by name and not have it included in your app (so that it can be managed by other people, another team or just a different release process) - or have shared ConfigMaps owned by operators and not distributed with your app.

e.g. shared ConfigMaps could be managed via one git repo with its own release cycle; then each app having its own git repo (and maybe using branches or separate repos for each environment like testing/production) and CI / CD pipeline to roll out changes.

For example with openshift.io right now each app just releases its Service + Deployment (or DeploymentConfig); then operations take care of maintaining all the secrets; then our CI / CD pipelines roll out new apps (new images + YAML files of Services/Deployments/DeploymentConfigs/ConfigMaps/PVCs) to each environment.

Interestingly, we'd like Routes to not be included in the YAML either, as it's common to want those enriched with an environment-specific certificate for HTTPS support; so that's something better done by a microservice/controller/operator than added to your YAML. e.g. see exposecontroller, which can generate Routes / Ingress / LoadBalancer / NodePort for your Service.

jstrachan commented 7 years ago

Longer term I'm expecting Databases and Message Queues and whatnot to not be specified as an app.yml but specified using the Service Catalog so that your app.yml just references known Service resources by name and may include some Service Catalog resources such as an Instance to help the brokers provision the Services you need (which typically create a Service and Secret for your app to use).

i.e. referring to Services and Secrets is gonna be increasingly common. The same would be true if Team A is using a service provided by Team B and vice versa; each team would have their own YAML and refer to each others services by name.

We should try to ensure that the YAML focuses on capturing a single team's responsibilities; if 2 teams need to release YAML on different schedules then it's better to separate those resources into separate YAML files in different git repos.

surajssd commented 7 years ago

I like the concise way of defining the application using k8s structs, where we are not reinventing things.

Also the point you made about the user caring only about their application and not about anything else is a valid one.

But I think the tool might not be able to give a good user experience if we assume things to be already present, like secrets. So how does a user who wants to define a secret use opencompose to do it?

Also, since we are assuming things are present already, the user will find out about missing things only while deploying. So we are loosely coupling the dependencies.

If you look at the configs that are generated right now, especially with docker-compose or even opencompose, the file acts as a single source of truth, and once converted it can be deployed on any cluster. No prerequisites are assumed other than a normal running cluster.

But assuming things are already present in the cluster and just referring to them from opencompose adds a new set of issues, since opencompose is no longer a single source of truth but something with invisible dependencies. Maybe the tool could become intelligent and, while deploying, query the cluster to check that the things being referred to are present beforehand.

I agree on the point that in production-grade deployments teams will be working on different pieces and writing their own configurations independently, and when we add those together it will all be deployed properly.

concaf commented 7 years ago

@jstrachan -

Hey James, thanks for bringing this up; this has opened up room for a much broader view and discussion.

Thoughts -

  • added the properties from DeploymentSpec other than the template (which is mostly just the PodSpec anyway)
  • added a service property for defining a service for the app (which uses ServiceSpec)
  • added an optional configData property for easily adding configuration data to an app
  • added optional extra PVCs (we could do the same with emptyDirs too I guess); it's easy to also mount external volumes/secrets/configmaps with the current App since it's a PodSpec already - the persistentVolumes property is more for defining the actual PVCs that are owned by this App

If we are adding other primitives to the pod spec, then it's not a standard Kubernetes spec anymore, right? This would mean that we have already modified the pod spec to include other primitives.

So, the way we have been writing the OpenCompose spec has been with the developer in mind and the most common directives they would use, and hence adding those to the spec. Till now, it has turned out to look a lot like a pod spec :) , but we are not limited by the pod spec and the way Kubernetes defines things.

One example would be how we are moving forward with secrets.

This is how a snippet from the OpenCompose file would look -

services:
- name: foobar
  containers:
  - image: foo/bar:tag
    env:
    - name: foo
      value: bar
    - name: travis_creds
      secretRef: ci_secret/travis

secrets:
- name: ci_secret
  data:
  - key: travis
    file: /tmp/travis.password

Here, we are referring to the root-level secrets using the secret_name/data_name syntax. This makes the spec a lot more concise, more intuitive, and yes, different from the pod spec. Also, for the root-level secrets, we allow taking input from a file, which is not part of the Kubernetes Secret object definition.
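
For what it's worth, expanding this shorthand back into the kubernetes structs is mechanical; here's a rough sketch, assuming client-go's core v1 types are vendored (expandSecretRef is a made-up helper name, and real code would validate the input):

package convert

import (
    "strings"

    api_v1 "k8s.io/client-go/pkg/api/v1"
)

// expandSecretRef turns the "secret_name/data_name" shorthand into the
// full kubernetes EnvVar it stands for.
func expandSecretRef(envName, ref string) api_v1.EnvVar {
    parts := strings.SplitN(ref, "/", 2)
    return api_v1.EnvVar{
        Name: envName,
        ValueFrom: &api_v1.EnvVarSource{
            SecretKeyRef: &api_v1.SecretKeySelector{
                LocalObjectReference: api_v1.LocalObjectReference{Name: parts[0]},
                Key:                  parts[1],
            },
        },
    }
}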

Now, this might not be the best example or syntax to move forward with, but if we move forward with extending the Kubernetes spec as is, then we lose out on the angle of "making defining microservices simpler", and we move to "making Kubernetes configurations simpler to write". We have been doing things the former way till now, but both have their downsides and tradeoffs which we should consider.

I think it's going to be easiest for developers to have an app YAML file for each app they work on. Typically the default case will be 1 app.yml per microservice (i.e. per git repo) anyway.

We could push this as a "best practice", but imo developers should have the ability to choose whether multiple microservices are defined in one file or several. A way to merge those would be awesome, but I'm not sure about enforcing this.

Incidentally, another argument against the services: header longer term for opencompose is that in the kubernetes ecosystem services are assumed to be kubernetes Service resources; whereas opencompose is more about making apps / microservices.

Yep, the naming can be confusing here; we can change that. But I think this root-level directive serves the purpose of having multiple microservices defined in one file, and even in decoupled teams it's very likely that a single team owns multiple services which are coupled together - and it would be great to define them in one file.

So today in the app.yml's container: you can expose secrets as environment variables or mount entries as files using the volume mounts in the usual kubernetes way - you just reference the secret by name (without including or generating the Secret resource).

Yep, but at the same time, a developer should be able to quickly define a secret in (a separate?) OpenCompose file, so in their developer environment, they don't have to interact with the cluster. Their OpenCompose file will be all they need to deploy their application, and if they do not want to define a secret in the OpenCompose file, the tooling should log that "the cluster must contain this resource, since you didn't define it in OpenCompose"

We should try to ensure that the YAML focuses on capturing a single team's responsibilities; if 2 teams need to release YAML on different schedules then it's better to separate those resources into separate YAML files in different git repos.

I really like the idea of multiple OpenCompose files coming together and merging into one application, but at the same time it should be left to the teams to decide how they want to do it; we should provide both "multiple files merging" and "one single file" options for users to choose from.


I really like the idea of "app", merging files, and borrowing more from the Kubernetes spec; it'd be great if both approaches converge into something more awesome :)

Would love to hear your thoughts about the above mentioned points.

jstrachan commented 7 years ago

@surajssd I think opencompose should focus on helping application developers make apps in the correct way for kubernetes, as immutable blobs of YAML that are then used to deploy apps into any environment/namespace/cluster of kubernetes. That doesn't mean it has to solve all problems. e.g. opencompose should not create PVs - that's something for Ops folks to set up on a per-cluster basis. Similarly, secrets are typically provisioned & rotated out of band of the application YAML. A typical app may be deployed to multiple environments and clusters, each with possibly different secrets but using the exact same application YAML.

So while we could include Secrets inside the app.yml, it's much more helpful to nudge application developers down the happy path of providing just the stuff that application developers should provide - and omitting the stuff that the cluster Operations team handles (PVs and Secrets).

However, upstream in kubernetes I'd love a SecretClaim resource which we could include inside the app.yml like we do for PVCs; it'd be rather like the PersistentVolumeClaim resource - indicating to an ops person/program/cluster that there is a need for a Secret (like a PV) without actually including the Secret - just providing the metadata to describe what kind of secret is required.

There's no real way around this loose coupling. It's exactly the same with PVCs: you can deploy your app, and if a cluster hasn't got suitable PVs set up (or automatic PV provisioning) then the PVCs won't bind. Ditto if you depend on a separate Service implementation which is managed by a separate microservice team. This loose coupling is going to be more and more common with the service catalog work.

However, there's nothing to stop opencompose providing a tool that checks whether a cluster has all the associated resources (PVs / Secrets) an app.yml depends on before it's created.
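
A sketch of what such a check could look like for Secrets, assuming a vendored client-go of that era (checkSecretRefs is a made-up name):

package check

import (
    "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// checkSecretRefs returns the names of referenced Secrets that don't exist
// in the target namespace, so the tool can warn before deploying.
func checkSecretRefs(client kubernetes.Interface, namespace string, refs []string) ([]string, error) {
    var missing []string
    for _, name := range refs {
        _, err := client.CoreV1().Secrets(namespace).Get(name, metav1.GetOptions{})
        switch {
        case errors.IsNotFound(err):
            missing = append(missing, name)
        case err != nil:
            return nil, err // some other API error
        }
    }
    return missing, nil
}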

If you want opencompose to include everything, that also implies including PVs, right? It also means you have to reject ever working with the service catalog. And it implies checking your Secrets into source control, which sounds like a bad idea.

There's nothing stopping folks adding a separate secrets.yml file with some default secrets; though we should really try to discourage that approach.

Incidentally, for simple secrets (login/passwords) it's common to use an OpenShift Template with parameter values for the user/pwd values, which can be auto-generated at template instantiation time. That only works for simple user/pwd fields though; it doesn't much help for certs or SSH public/private keys or whatnot.

But really secret management is something we should tackle upstream in the kubernetes ecosystem via something like a SecretClaim

jstrachan commented 7 years ago

@containscafeine I'm not suggesting we modify any structs / types from kubernetes - I'm suggesting we extend those structs to add more fields. That's easy to do in golang via struct embedding or in java (using base classes) - which is what I did in the Java implementation via:

public class App extends PodSpec {

Just like you've added your own properties which are not in PodSpec, I've done the same in App - the difference is I've extended PodSpec and reused ServiceSpec, DeploymentStrategy, PersistentVolumeClaimSpec, and reused the same properties as DeploymentSpec, so that it's all very consistent with kubernetes - without changing any kubernetes structs at all.

I think it's going to be easiest for developers to have an app YAML file for each app they work on. Typically the default case will be 1 app.yml per microservice (i.e. per git repo) anyway.

We could push this as a "best practice", but imo developers should have the ability to choose whether multiple microservices are defined in one file or several. A way to merge those would be awesome, but I'm not sure about enforcing this.

I'm not against allowing multiple apps to be defined in the same file - though I think it makes the developer UX much worse - but it's also trivial to combine the generated output into a single YAML to simplify installing it via kubectl / oc. Though if there were a YAML for multiple apps I'd prefer it used the kubernetes conventions rather than docker compose conventions (lists of objects).

In response to the rest of your comments on Secrets (I've not quoted it all here as that's hard in github comments ;-) I'm not against having some clearly defined default secrets YAML file that's only used by developers trying things out but is never used for a real pre-production or production deployment. However, that feels to me more like something we'd keep in a separate file so that you can omit the default secrets when deploying your app to pre-prod/prod.

If we had a SecretClaim resource which could supply dummy default values, with those values omitted from real deployments, that would maybe solve all of our issues. We'd be able to put SecretClaim resources OOTB in all app.yml files like we do with PVCs; they'd have some dummy insecure defaults that get everyone started in development (e.g. using admin/admin login/pwds and stuff), but then we'd have tools that could look up or generate new secrets in other environments like we've done in the past in fabric8 - then it's a win-win: we know what secrets need to be populated in real environments, but in developer experimental environments we can get going fast. (To be clear, we used to use the presence of these secret annotations to generate or load secrets from disk as applications get deployed - to avoid hard-coding secrets inside application YAML files. We could do something similar if we added a SecretClaim resource, which it'd be nice to upstream into kubernetes really.)

e.g. I can imagine a controller in kubernetes using the presence of SecretClaim resources to then provision Secret resources by looking things up in tools like Vault

pradeepto commented 7 years ago

Similarly secrets are something that are typically provisioned & rotated out of band of application YAML. A typical app may be deployed to multiple environments and clusters; each with possibly different secrets but using the exact same application YAML.

This topic is being discussed here: Issue - https://github.com/redhat-developer/opencompose/issues/47 and Proposal - https://github.com/redhat-developer/opencompose/pull/149

concaf commented 7 years ago

After thinking about it for some time, I think that extending PodSpec instead of writing our own similar-looking spec makes sense because -

At the same time, we should -

Implementation

We can embed the PodSpec like -

// assuming client-go's API types are vendored, e.g.
// import api_v1 "k8s.io/client-go/pkg/api/v1"

type App struct {
    api_v1.PodSpec
    field1 int  // extra opencompose-specific fields go here
    field2 bool
}

Here, we lose the ability to modify/remove any fields in the PodSpec; we can only append something to the spec (like a ConfigMap).

I don't like this. It gives us only partial control over the spec that we are writing.

Why? For instance, I will be bound to define an environment variable coming from a secret like -

- name: foobar
  containers:
  - name: foo
    image: foo/bar:tag
    env:
    - name: foo
      value: bar
    - name: travis_creds
      valueFrom:
        secretKeyRef:
          name: ci_secret
          key: travis

This adds 2 indentation levels, introduces 3 more fields, and adds 3 lines for each such declaration, compared to the current secrets proposal at #105, which proposes specifying the same thing like -

- name: foobar
  containers:
  - name: foo
    image: foo/bar:tag
    env:
    - name: foo
      value: bar
    - name: travis_creds
      secretRef: ci_secret/travis

What's the solution?

If there is a way we can modify an embedded struct in golang, that would be best (I could not find one :( ). If there is not, then we should copy the entire PodSpec struct and modify the fields which we think we can make better, leaving the other fields as they are, hence giving us the full power of the PodSpec without losing out on any functionality.

This will be a pain to implement, because the PodSpec is not small and we will have to assign a lot of parameters during conversion, but it will be a one-time effort: our version of client-go is vendored, so we will only have to update our PodSpec whenever we update client-go, and it is not going to change much.

Thoughts?

CC: @kadel @surajssd @jstrachan @pradeepto

kadel commented 7 years ago

If there is a way we can modify an embedded struct in golang, that would be the best (I could not find one :( )

No, this is not possible.

This will be a pain to implement, because the PodSpec is not small, and we will have to assign a lot of parameters during conversion, but this will be a one time effort because our version of client-go is vendored, and we will only have to update our PodSpec whenever we update client-go, and it is not going to change much.

If we end up doing this, then this part will have to be automated. We will have to figure out how to keep it in sync with PodSpec automatically, or with only a little effort.

concaf commented 7 years ago

If there is a way we can modify an embedded struct in golang, that would be the best (I could not find one :( )

No, this is not possible.

@kadel I was wrong, I should have studied structs more carefully :rofl: , I just talked to @baijum and found out that when we embed a struct like -

type child struct {
    parent
    field2 bool // conflicting field from the parent struct
}

then the field in the child struct takes precedence, and the conflicting field in the embedded parent struct is left at its zero value (it's still reachable as child.parent.field2). I have written a small snippet here - https://play.golang.org/p/L4eoQf33F1
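
Here's a runnable version of that snippet (roughly what the playground link demonstrates):

package main

import "fmt"

type parent struct{ field2 bool }

type child struct {
    parent
    field2 bool // shadows the promoted parent.field2
}

func main() {
    c := child{field2: true}
    fmt.Println(c.field2)        // true: the outer field wins on plain access
    fmt.Println(c.parent.field2) // false: the embedded field keeps its zero value
}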

So, using this, we can extend the PodSpec like -

type openComposeEnvVariable struct {
    Key       string
    Value     *string
    SecretRef *SecretDef
}

type openComposeContainers struct {
    api_v1.Container
    Env []openComposeEnvVariable
}

type App struct {
    api_v1.PodSpec
    Containers []openComposeContainers
}

WDYT about this? We don't need to automate anymore, and we get the entire PodSpec with full control over the spec.

kadel commented 7 years ago

hmm, interesting. Good to know ;-)

But we should be careful with what we replace. We can discuss this after we evaluate Suraj's POC.