kedgeproject / kedge

Kedge : Simple, Concise & Declarative Kubernetes Applications
Apache License 2.0
298 stars 41 forks source link

[spec change] Make containers a subkey #119

Closed cdrage closed 7 years ago

cdrage commented 7 years ago

We're running into confusion as per: #116 as well as #110 which causes issues such as these to occur:

name: web

containers:
- image: centos/httpd

services:
- name: httpd
  type: NodePort
  ports:
  - port: 8080

jobs:
- containers:
  - image: perl
    name: pival
    command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
  restartPolicy: Never
  parallelism: 3

Which IMO is really confusing.

I propose we make containers a subkey. So instead, we have this:

name: web                                                                                                                                                                                                                                                                         

services:                                                                                                                                                                                                                                                                         
  - containers:                                                                                                                                                                                                                                                                   
    - image: centos/httpd                                                                                                                                                                                                                                                         
      name: httpd                                                                                                                                                                                                                                                                 
  type: NodePort                                                                                                                                                                                                                                                                  
  ports:                                                                                                                                                                                                                                                                          
  - port: 8080                                                                                                                                                                                                                                                                    

jobs:                                                                                                                                                                                                                                                                             
  - containers:                                                                                                                                                                                                                                                                   
    - image: perl                                                                                                                                                                                                                                                                 
      name: pival                                                                                                                                                                                                                                                                 
      command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]                                                                                                                                                                                                               
  restartPolicy: Never                                                                                                                                                                                                                                                            
  parallelism: 3

Thus, we can also implement #110 (for example), by having separate root keys:

name: web                                                                                                                                                                                                                                                                         

services:                                                                                                                                                                                                                                                                         
  - containers:                                                                                                                                                                                                                                                                   
    - image: centos/httpd                                                                                                                                                                                                                                                         
  name: httpd                                                                                                                                                                                                                                                                     
  type: NodePort                                                                                                                                                                                                                                                                  
  ports:                                                                                                                                                                                                                                                                          
    - port: 8080                                                                                                                                                                                                                                                                  

jobs:                                                                                                                                                                                                                                                                             
  - containers:                                                                                                                                                                                                                                                                   
    - image: perl                                                                                                                                                                                                                                                                 
      name: pival                                                                                                                                                                                                                                                                 
      command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]                                                                                                                                                                                                               
  restartPolicy: Never                                                                                                                                                                                                                                                            
  parallelism: 3                                                                                                                                                                                                                                                                  

statefulSet:                                                                                                                                                                                                                                                                      
  - containers:                                                                                                                                                                                                                                                                   
    - image: mysql                                                                                                                                                                                                                                                                
  name: mysql                                                                                                                                                                                                                                                                     
  replicas: 3
concaf commented 7 years ago

@cdrage so I'm not looking at it from "making containers a subkey" angle, maybe more like the discussions we are having at the Stateful Sets issue - #110.

So, maybe having controllers at the top level of the spec, like "deployment", "statefulSets", "jobs" at the top level, and everything inside them, including services.

For instance, services: will be a part of statefulSets and deployments, but not jobs.

cdrage commented 7 years ago

@containscafeine This is getting confusing between Kubernetes services and our definition of "services". Perhaps we need to draw a line to say that these are Kubernetes services and not a "services" definition.

But yeah, that's an alternative way that we could do it.

concaf commented 7 years ago

@cdrage yep, the services definition we have is the Kubernetes services, we don't have a services definition (meaning microservices or some such) (AFAIK)

surajssd commented 7 years ago

@cdrage

@containscafeine This is getting confusing between Kubernetes services and our definition of "services". Perhaps we need to draw a line to say that these are Kubernetes services and not a "services" definition.

Actually these kedge services are Kubernetes services.

At root level we have:

Services          []ServiceSpecMod   `json:"services,omitempty"`

Where ServiceSpecMod is:

type ServiceSpecMod struct {
    api_v1.ServiceSpec `json:",inline"`
    Name               string           `json:"name,omitempty"`
    Ports              []ServicePortMod `json:"ports"`
}

see the line:

    api_v1.ServiceSpec `json:",inline"`

It is actually a Kubernetes service spec embedded in it.

surajssd commented 7 years ago

As discussed in here https://github.com/kedgeproject/kedge/issues/110#issuecomment-313016324

there are two approaches here either add a root level key for each controller:

First approach

for deployment:

deployment:
  containers:
  - image: centos/httpd

for job

job:
  containers:
  - image: gitcrawler

vs

Second approach

for deployment:

containers:
- image: centos/httpd

for job:

kind: job
containers:
- image: gitcrawler

Speaking for myself I am inclined towards the second approach. Because it is less indentation, also if user does not provide the kind field then we can always default to deployment, with first approach this default won't be possible.

Also in second approach we can do something similar to how kubernetes does the detection based on the kind field. So that we don't need to populate the app struct with everything, it will rather be a clean segregated structs depending on the kind everything will be happening.

concaf commented 7 years ago

Thinking about it, there are 2 approaches -

job: name: piJob containers:

surajssd commented 7 years ago

Let's keep this thread/issue for discussing the approach of how we gonna enable multiple controllers

surajssd commented 7 years ago

@containscafeine you did not suggest what approach you wanna go with?

concaf commented 7 years ago

I'm inclining towards the second one, i.e. introducing a kind or controller keyword instead of increasing indentation with root level controller fields.

ping @kadel @jstrachan would love your feedback on this :)

cdrage commented 7 years ago

Seems so messy, that's the problem :( People will have to memorize that if it is a "job" or anything that's not a service, it's not going to have the "root" key of services.

For @containscafeine 's suggestion

:+1: for the first approach, the second approach get's confusing as suddenly you have "services" but in "job" you don't (everything is a root key? for restartPolicy nad parrallelism? what?)

As how @surajssd described, the services key, at least within the code, is actually Kubernetes services. So why not split them up into separate root-keys based on functionality? (daemonSet, statefulSet, services, jobs, etc.)

cdrage commented 7 years ago

For @containscafeine I believe this is wrong:

deployment:
  name: httpd
  containers:
  - image: centos/httpd
  services:
  - name: httpd
    type: NodePort
    ports:
    - port: 8080

job:
  name: piJob
  containers:
  - image: perl
    name: pival
    command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
  restartPolicy: Never
  parallelism: 3

It should be formatted like this:

deployment:
  name: httpd
  containers:
  - image: centos/httpd

services:
 name: httpd
  containers:
  - image: centos/httpd
  type: NodePort
  ports:
    - port: 8080

job:
  name: piJob
  containers:
  - image: perl
    name: pival
    command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
  restartPolicy: Never
  parallelism: 3s

Considering the first approach is to device them into separate root-level keys ^^.

Of course, with this example, we're manually deploying deployment rather than having services automatically generate that artifact.

kadel commented 7 years ago

As a rule we decided that one file (or one section of the file) is going to describe one application. In that sense, one application will always be just one Deployment or one StatefullSet. We don't need a way to define both in the same file.

I would like to keep it is without making containers subkey

cdrage commented 7 years ago

@kadel How do you think we should implement the different kinds?

If we want separate files, ideally i'd be:

kind: job

then.

kadel commented 7 years ago

I got to this discussion later, so I'm still a bit confused but why we can't do jobs just like list of JobSpec under jobs key?

jobs:
 - name: job1
   template:
      metadata:
        name: pi
      spec:
        containers:
        - name: pi
          image: perl
          command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
        restartPolicy: Never

this will also solve https://github.com/kedgeproject/kedge/issues/114

cdrage commented 7 years ago

@kadel Because it's confusing having containers as a root key as well as a subkey. Basically, if you defining a service you're using it as a root key. If you're defining jobs as per #116 you're using it as a sub-key.

It should be the same between the two. Moving containers as a subkey within services solves both problems.

kadel commented 7 years ago

I'm more and more confused :-( What do you mean by services? Services don't have containers.

cdrage commented 7 years ago

@kadel Yeah, they do. And like how you're confused, it confuses me too:

name: web

containers:
- image: centos/httpd

services:
- name: httpd
  type: NodePort
  ports:
  - port: 8080

Should be:

name: web

services:
- name: httpd
  containers:
   - image: centos/httpd
  type: NodePort
  ports:
  - port: 8080
kadel commented 7 years ago

No. The second one is wrong. Services don't have containers in it, it doesn't' make sense. It's Kubernetes Services. It's networking concept.

cdrage commented 7 years ago

Ah okay. Hence the confusion because we have two definition of "services" within Kedge at the moment. One is similar to how it's the same as Docker Compose (App Services) and the other being Kubernetes (Services, networking like you said).

What implementation do you think we should go through with? @containscafeine 's or @surajssd 's?

kadel commented 7 years ago

Ah okay. Hence the confusion because we have two definition of "services" within Kedge at the moment.

There should be only one service in Kedge, and that is Kubernetes Service.

What implementation do you think we should go through with? @containscafeine 's or @surajssd 's?

In most cases, users will define Deployments. I would like to keep it as simple as possible for deployments.

Because of that I like what is @surajssd proposing as the second approach in https://github.com/kedgeproject/kedge/issues/119#issuecomment-313308792

But I don't think we need "kind" filed for Jobs. I would use this only for "StatefulSets"

cdrage commented 7 years ago

@kadel There are other use-cases such as DaemonSets / Jobs / CronJobs which this would be useful for.

kadel commented 7 years ago

Jobs and CronJobs are a little bit different. They are not usually standalone applications, they complement application that is represented by Deployment.

This is also why Jobs have separate root level key as they are proposed right now.

DoeamonSets are really specific, they are used for cluster wide things, as such they are rarely used by regular developers. DaemonSets are almost exclusively used by cluster managers. I don't think we need some extra support for it beyond what is described in https://github.com/kedgeproject/kedge/issues/157

cdrage commented 7 years ago

@kadel I believe adding kind would be beneficial to create extensive examples for deployments.

StatefulSets are becoming more popular as it's beginning to be the go-to for databases. Why not add kind?

Limiting Kedge to only Services and Deployments would be limiting to what people can do with Kedge. Sure, we can add extras, but isn't the point of Kedge to simplify k8s artifacts?

kadel commented 7 years ago

I'm not saying that Kedge shouldn't do StatefulSets. Only that I wouldn't worry about it for now. We can add it later if needed until then using 'extras' can be enough for a small percentage of people that actually need StatefulSet. Kedge have to have some form of 'extras' anyway, regardless what we do with StatefulSet.

As you are saying its used for databases. Developer's don't deploy databases that often, and if they do its one-time thing.

Its OK to force people to learn Kubernetes artifacts. Kedge is using Kubernetes artifacts everywhere. There is no field that is redefined its all just Kubernetes primitives. Simplification is there in the form that users don't have to write everything, a lot of fields can be computed or deduced from others.

surajssd commented 7 years ago

I would like to see kind field which defines the type of controller, does not matter if job or deployment or statefulset. Because it makes the spec look very consistent.

If we consider making jobs as root level field and creating list of job object, it is confusing to see containers field at root level for deployments and also see it at job level underneath as @cdrage raised similar concerns earlier. I am of the opinion that we should have controllers differentiated with field kind.

Also we should not assume jobs will always be associated with some application. If that was the case, there would already be something present in Kubernetes which dealt with this particular use-case.

Also all the controllers use the same pod spec, this also allows us to reuse the podSpec that we have modified to our usage. We can use the same for these various other controllers.

So for the sake of consistency on the user side, we should add root level string field kind and then accordingly parse the data into appropriate objects internally.

kadel commented 7 years ago

I would like to see kind field which defines the type of controller, does not matter if job or deployment or statefulset. Because it makes the spec look very consistent.

If we consider making jobs as root level field and creating list of job object, it is confusing to see containers field at root level for deployments and also see it at job level underneath as @cdrage raised similar concerns earlier. I am of the opinion that we should have controllers differentiated with field kind.

I feel that this will go away from the whole idea of defining an application, and stepping back closer to Kubernetes OPs view.

Also we should not assume jobs will always be associated with some application. If that was the case, there would already be something present in Kubernetes which dealt with this particular use-case.

I'm not saying that this will be always the case, but its most common. Kuberentes has different approach. Kuberentes is breaking everything into smallest possible pieces, you can't compare this with Kubernets approach.

Also all the controllers use the same pod spec, this also allows us to reuse the podSpec that we have modified to our usage. We can use the same for these various other controllers.

So for the sake of consistency on the user side, we should add root level string field kind and then accordingly parse the data into appropriate objects internally.

kadel commented 7 years ago

If we really add Kind object that we can go back to Kubernetes resources and just provide some additional shortcuts but keep the structure of the object intacted.

Otherwise we are creating some strange hybrid that will just be more and more confusing for everyone. This is how this looks to me.

cdrage commented 7 years ago

@kadel

I'm against the kind format, but rather changing the root keys to their respective kinds, my original proposal of:

deployment:
  name: httpd
  containers:
  - image: centos/httpd

services:
 name: httpd
  containers:
  - image: centos/httpd
  type: NodePort
  ports:
    - port: 8080

job:
  name: piJob
  containers:
  - image: perl
    name: pival
    command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
  restartPolicy: Never
  parallelism: 3s

Thus avoiding confusion of a weird "hybrid" between Kedge and Kubernetes artifacts.

concaf commented 7 years ago

@cdrage @kadel the more I think and talk about Jobs, it seems that they are generally not tied to one part of the application, in fact they are tied to multiple parts. e.g., populate the "database", so "web" can use it; train a "model", so the "API gateway" can use it. So use cases for jobs are mostly tied to >=2 parts of the broader application, and it makes more sense to me to go the "kind: Job" or "controller: Job" way in a separate file.

Also, IMHO, we are going in circles with this issue since everyone has a "different" opinion on the implementation and nothing is right or wrong, we need to decide on this. Should we ping more folks, ask for opinion, or maybe get into a meeting tomorrow and decide this?

concaf commented 7 years ago

Okay, so we are not able to reach a decision on this.

Ping @bparees @jstrachan, would love your opinion on this, thanks!

Problem Statement: Currently, we only support creating deployments in kedge, but at some point in time (when feature requests come in), we'd like to support other Kubernetes controllers as well, e.g. StatefulSets, Daemon Sets, etc. Currently, we need a way to define "Jobs" and "Cron Jobs" controllers in kedge.

Proposed Solutions:

  1. Since jobs could be tied to one part of the application (e.g. database), and kedge promotes defining each part of the broader application in a separate file, we could introduce a root level jobs: field to define jobs for that particular piece.
name: httpd
containers:
- image: centos/httpd
services:
- name: httpd
  type: NodePort
  ports:
  - port: 8080
jobs:
- containers:
  - image: perl
    name: pival
    command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
  restartPolicy: Never
  parallelism: 3

Cons:

  1. Introduce root level fields for controllers, like the following -
name: httpd

deployment:
  containers:
  - image: centos/httpd
  services:
  - name: httpd
    type: NodePort
    ports:
    - port: 8080

job:
  containers:
  - image: perl
    name: pival
    command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
  restartPolicy: Never
  parallelism: 3

Cons:

  1. Introduce an optional root level field controller e.g. controller: job, controller: statefulset. Every kedge file can define only one controller. Controllers which are logically a part of the same file, e.g. jobs, can be separated by the YAML document separator ---, and be included in the same file, or be a part of another file.

httpd.yaml -

name: httpd
controller: deployment # optional, defaults to deployment
containers:
- image: centos/httpd
services:
- name: httpd
  type: NodePort
  ports:
  - port: 8080

job.yaml -

controller: job
containers:
- image: perl
  name: pival
  command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
parallelism: 3

Cons:

cdrage commented 7 years ago

@containscafeine To note, this also reflects other implementations such as StatefulSets, DaemonSets, CronJobs when they are implemented in the future,

for example:

statefulSet:                                                                                                                                                                                                                                                                      
  - containers:                                                                                                                                                                                                                                                                   
    - image: mysql                                                                                                                                                                                                                                                                
  name: mysql                                                                                                                                                                                                                                                                     
  replicas: 3
bparees commented 7 years ago

an application is what you want it to be. if you don't want your job to be part of an existing app because it's more "global" than that, define it in a second application. ie option 1.

this is assuming you are committed to the "application as a top level config container" concept for your DSL.

concaf commented 7 years ago

@bparees -

an application is what you want it to be. if you don't want your job to be part of an existing app because it's more "global" than that, define it in a second application. ie option 1.

do you mean option 3?

bparees commented 7 years ago

no I mean option 1. I would then envision having, for example, 3 files:

1) a file that defines stuff specific to microservice1

name: mymicroservice1
containers:
- image: centos/httpd
services:
- name: httpd
  type: NodePort
  ports:
  - port: 8080
jobs:
- containers:
  - image: perl
    name: mymicroservice1specificjob
    command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
  restartPolicy: Never
  parallelism: 3

2) a file that defines stuff specific to microservice2

name: mymicroservice2
containers:
- image: centos/httpd
services:
- name: httpd2
  type: NodePort
  ports:
  - port: 8081
jobs:
- containers:
  - image: perl
    name: mymicroservice2specificjob
    command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
  restartPolicy: Never
  parallelism: 3

3) a file that defines a job that's either global, shared by both of the above microservices, or unrelated to any of them:

name: globalresources
jobs:
- containers:
  - image: perl
    name: myglobaljob
    command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
  restartPolicy: Never
  parallelism: 3
concaf commented 7 years ago

We've had a lot of conversation about this issue, both here and at our Slack https://kedgeproject.slack.com/.

There were 3 options and all of those were "different", there was no right or wrong.

The most pressing argument against Option 1 was that it brings about inconsistency in the spec, in the sense that Kubernetes' Deployment controller is given special treatment compared to other controllers like StatefulSets and Jobs, because a deployment can be defined at the root level and not other controllers, those are supposed to have their own field. Every Kubernetes' controller should be able to define one microservice in its entirety. As for jobs, there are use cases where those could be tied to an application or could be totally different stories in themselves.

As for Option 3, it might not make sense when a job has to be tied with a deployment, since both the definitions have to be def ined different files. Even though the --- could be used for such a use case, but it could be an abuse of the same and kedge definition should support logically defining parts of an application that belong together without a separator of any sort.

In the end, we decided to vote, since everyone was very, very divided on which way to go forward with. The vote count was four for Option 3 and three for Option 1.

So, for now, we are going forward with Option 3, and time bounding it for 6 weeks till we get some feedback when we deploy real world applications. If it such happens that the UX does not make sense for Option 3, we will reiterate on this issue.

concaf commented 7 years ago

Closed by #195