Closed cdrage closed 7 years ago
@cdrage so I'm not looking at it from "making containers a subkey" angle, maybe more like the discussions we are having at the Stateful Sets issue - #110.
So, maybe having controllers at the top level of the spec, like "deployment", "statefulSets", "jobs" at the top level, and everything inside them, including services.
For instance, services:
will be a part of statefulSets and deployments, but not jobs.
@containscafeine This is getting confusing between Kubernetes services and our definition of "services". Perhaps we need to draw a line to say that these are Kubernetes services and not a "services" definition.
But yeah, that's an alternative way that we could do it.
@cdrage yep, the services definition we have is the Kubernetes services, we don't have a services definition (meaning microservices or some such) (AFAIK)
@cdrage
@containscafeine This is getting confusing between Kubernetes services and our definition of "services". Perhaps we need to draw a line to say that these are Kubernetes services and not a "services" definition.
Actually these kedge services are Kubernetes services.
At root level we have:
Services []ServiceSpecMod `json:"services,omitempty"`
Where ServiceSpecMod
is:
type ServiceSpecMod struct {
api_v1.ServiceSpec `json:",inline"`
Name string `json:"name,omitempty"`
Ports []ServicePortMod `json:"ports"`
}
see the line:
api_v1.ServiceSpec `json:",inline"`
It is actually a Kubernetes service spec embedded in it.
As discussed in here https://github.com/kedgeproject/kedge/issues/110#issuecomment-313016324
there are two approaches here either add a root level key for each controller:
for deployment:
deployment:
containers:
- image: centos/httpd
for job
job:
containers:
- image: gitcrawler
vs
for deployment:
containers:
- image: centos/httpd
for job:
kind: job
containers:
- image: gitcrawler
Speaking for myself I am inclined towards the second approach. Because it is less indentation, also if user does not provide the kind
field then we can always default to deployment
, with first approach this default won't be possible.
Also in second approach we can do something similar to how kubernetes does the detection based on the kind field. So that we don't need to populate the app struct with everything, it will rather be a clean segregated structs depending on the kind everything will be happening.
Thinking about it, there are 2 approaches -
deployment:
name: httpd
containers:
- image: centos/httpd
services:
- name: httpd
type: NodePort
ports:
- port: 8080
job: name: piJob containers:
kind
or controller
field -
name: httpd
kind: deployment # optional, defaults to deployment
containers:
kind: job containers:
Let's keep this thread/issue for discussing the approach of how we gonna enable multiple controllers
@containscafeine you did not suggest what approach you wanna go with?
I'm inclining towards the second one, i.e. introducing a kind
or controller
keyword instead of increasing indentation with root level controller fields.
ping @kadel @jstrachan would love your feedback on this :)
Seems so messy, that's the problem :( People will have to memorize that if it is a "job" or anything that's not a service, it's not going to have the "root" key of services
.
For @containscafeine 's suggestion
:+1: for the first approach, the second approach get's confusing as suddenly you have "services" but in "job" you don't (everything is a root key? for restartPolicy nad parrallelism? what?)
As how @surajssd described, the services
key, at least within the code, is actually Kubernetes services. So why not split them up into separate root-keys based on functionality? (daemonSet
, statefulSet
, services
, jobs
, etc.)
For @containscafeine I believe this is wrong:
deployment:
name: httpd
containers:
- image: centos/httpd
services:
- name: httpd
type: NodePort
ports:
- port: 8080
job:
name: piJob
containers:
- image: perl
name: pival
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
parallelism: 3
It should be formatted like this:
deployment:
name: httpd
containers:
- image: centos/httpd
services:
name: httpd
containers:
- image: centos/httpd
type: NodePort
ports:
- port: 8080
job:
name: piJob
containers:
- image: perl
name: pival
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
parallelism: 3s
Considering the first approach is to device them into separate root-level keys ^^.
Of course, with this example, we're manually deploying deployment
rather than having services
automatically generate that artifact.
As a rule we decided that one file (or one section of the file) is going to describe one application. In that sense, one application will always be just one Deployment or one StatefullSet. We don't need a way to define both in the same file.
I would like to keep it is without making containers subkey
@kadel How do you think we should implement the different kinds?
If we want separate files, ideally i'd be:
kind: job
then.
I got to this discussion later, so I'm still a bit confused but why we can't do jobs just like list of JobSpec under jobs key?
jobs:
- name: job1
template:
metadata:
name: pi
spec:
containers:
- name: pi
image: perl
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
this will also solve https://github.com/kedgeproject/kedge/issues/114
@kadel Because it's confusing having containers as a root key as well as a subkey. Basically, if you defining a service
you're using it as a root key. If you're defining jobs as per #116 you're using it as a sub-key.
It should be the same between the two. Moving containers
as a subkey within services
solves both problems.
I'm more and more confused :-(
What do you mean by services
? Services don't have containers.
@kadel Yeah, they do. And like how you're confused, it confuses me too:
name: web
containers:
- image: centos/httpd
services:
- name: httpd
type: NodePort
ports:
- port: 8080
Should be:
name: web
services:
- name: httpd
containers:
- image: centos/httpd
type: NodePort
ports:
- port: 8080
No. The second one is wrong. Services don't have containers in it, it doesn't' make sense. It's Kubernetes Services. It's networking concept.
Ah okay. Hence the confusion because we have two definition of "services" within Kedge at the moment. One is similar to how it's the same as Docker Compose (App Services) and the other being Kubernetes (Services, networking like you said).
What implementation do you think we should go through with? @containscafeine 's or @surajssd 's?
Ah okay. Hence the confusion because we have two definition of "services" within Kedge at the moment.
There should be only one service in Kedge, and that is Kubernetes Service.
What implementation do you think we should go through with? @containscafeine 's or @surajssd 's?
In most cases, users will define Deployments. I would like to keep it as simple as possible for deployments.
Because of that I like what is @surajssd proposing as the second approach in https://github.com/kedgeproject/kedge/issues/119#issuecomment-313308792
But I don't think we need "kind" filed for Jobs. I would use this only for "StatefulSets"
@kadel There are other use-cases such as DaemonSets / Jobs / CronJobs which this would be useful for.
Jobs and CronJobs are a little bit different. They are not usually standalone applications, they complement application that is represented by Deployment.
This is also why Jobs have separate root level key as they are proposed right now.
DoeamonSets are really specific, they are used for cluster wide things, as such they are rarely used by regular developers. DaemonSets are almost exclusively used by cluster managers. I don't think we need some extra support for it beyond what is described in https://github.com/kedgeproject/kedge/issues/157
@kadel I believe adding kind would be beneficial to create extensive examples for deployments.
StatefulSets are becoming more popular as it's beginning to be the go-to for databases. Why not add kind
?
Limiting Kedge to only Services and Deployments would be limiting to what people can do with Kedge. Sure, we can add extras
, but isn't the point of Kedge to simplify k8s artifacts?
I'm not saying that Kedge shouldn't do StatefulSets. Only that I wouldn't worry about it for now. We can add it later if needed until then using 'extras' can be enough for a small percentage of people that actually need StatefulSet. Kedge have to have some form of 'extras' anyway, regardless what we do with StatefulSet.
As you are saying its used for databases. Developer's don't deploy databases that often, and if they do its one-time thing.
Its OK to force people to learn Kubernetes artifacts. Kedge is using Kubernetes artifacts everywhere. There is no field that is redefined its all just Kubernetes primitives. Simplification is there in the form that users don't have to write everything, a lot of fields can be computed or deduced from others.
I would like to see kind
field which defines the type of controller, does not matter if job
or deployment
or statefulset
. Because it makes the spec look very consistent.
If we consider making jobs
as root level field and creating list of job
object, it is confusing to see containers
field at root level for deployments
and also see it at job
level underneath as @cdrage raised similar concerns earlier. I am of the opinion that we should have controllers differentiated with field kind
.
Also we should not assume jobs
will always be associated with some application. If that was the case, there would already be something present in Kubernetes which dealt with this particular use-case.
Also all the controllers use the same pod spec, this also allows us to reuse the podSpec
that we have modified to our usage. We can use the same for these various other controllers.
So for the sake of consistency on the user side, we should add root level string field kind
and then accordingly parse the data into appropriate objects internally.
I would like to see kind field which defines the type of controller, does not matter if job or deployment or statefulset. Because it makes the spec look very consistent.
If we consider making jobs as root level field and creating list of job object, it is confusing to see containers field at root level for deployments and also see it at job level underneath as @cdrage raised similar concerns earlier. I am of the opinion that we should have controllers differentiated with field kind.
I feel that this will go away from the whole idea of defining an application, and stepping back closer to Kubernetes OPs view.
Also we should not assume jobs will always be associated with some application. If that was the case, there would already be something present in Kubernetes which dealt with this particular use-case.
I'm not saying that this will be always the case, but its most common. Kuberentes has different approach. Kuberentes is breaking everything into smallest possible pieces, you can't compare this with Kubernets approach.
Also all the controllers use the same pod spec, this also allows us to reuse the podSpec that we have modified to our usage. We can use the same for these various other controllers.
So for the sake of consistency on the user side, we should add root level string field kind and then accordingly parse the data into appropriate objects internally.
If we really add Kind object that we can go back to Kubernetes resources and just provide some additional shortcuts but keep the structure of the object intacted.
Otherwise we are creating some strange hybrid that will just be more and more confusing for everyone. This is how this looks to me.
@kadel
I'm against the kind
format, but rather changing the root keys to their respective kinds, my original proposal of:
deployment:
name: httpd
containers:
- image: centos/httpd
services:
name: httpd
containers:
- image: centos/httpd
type: NodePort
ports:
- port: 8080
job:
name: piJob
containers:
- image: perl
name: pival
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
parallelism: 3s
Thus avoiding confusion of a weird "hybrid" between Kedge and Kubernetes artifacts.
@cdrage @kadel the more I think and talk about Jobs, it seems that they are generally not tied to one part of the application, in fact they are tied to multiple parts. e.g., populate the "database", so "web" can use it; train a "model", so the "API gateway" can use it. So use cases for jobs are mostly tied to >=2 parts of the broader application, and it makes more sense to me to go the "kind: Job" or "controller: Job" way in a separate file.
Also, IMHO, we are going in circles with this issue since everyone has a "different" opinion on the implementation and nothing is right or wrong, we need to decide on this. Should we ping more folks, ask for opinion, or maybe get into a meeting tomorrow and decide this?
Okay, so we are not able to reach a decision on this.
Ping @bparees @jstrachan, would love your opinion on this, thanks!
Problem Statement: Currently, we only support creating deployments in kedge, but at some point in time (when feature requests come in), we'd like to support other Kubernetes controllers as well, e.g. StatefulSets, Daemon Sets, etc. Currently, we need a way to define "Jobs" and "Cron Jobs" controllers in kedge.
Proposed Solutions:
jobs:
field to define jobs for that particular piece.name: httpd
containers:
- image: centos/httpd
services:
- name: httpd
type: NodePort
ports:
- port: 8080
jobs:
- containers:
- image: perl
name: pival
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
parallelism: 3
Cons:
name: httpd
deployment:
containers:
- image: centos/httpd
services:
- name: httpd
type: NodePort
ports:
- port: 8080
job:
containers:
- image: perl
name: pival
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
parallelism: 3
Cons:
controller
e.g. controller: job
, controller: statefulset
.
Every kedge file can define only one controller.
Controllers which are logically a part of the same file, e.g. jobs, can be separated by the YAML document separator ---
, and be included in the same file, or be a part of another file.httpd.yaml -
name: httpd
controller: deployment # optional, defaults to deployment
containers:
- image: centos/httpd
services:
- name: httpd
type: NodePort
ports:
- port: 8080
job.yaml -
controller: job
containers:
- image: perl
name: pival
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
parallelism: 3
Cons:
---
can be used to logically group such controllers, but it could be an abuse of the same.@containscafeine To note, this also reflects other implementations such as StatefulSets, DaemonSets, CronJobs when they are implemented in the future,
for example:
statefulSet:
- containers:
- image: mysql
name: mysql
replicas: 3
an application is what you want it to be. if you don't want your job to be part of an existing app because it's more "global" than that, define it in a second application. ie option 1.
this is assuming you are committed to the "application as a top level config container" concept for your DSL.
@bparees -
an application is what you want it to be. if you don't want your job to be part of an existing app because it's more "global" than that, define it in a second application. ie option 1.
do you mean option 3?
no I mean option 1. I would then envision having, for example, 3 files:
1) a file that defines stuff specific to microservice1
name: mymicroservice1
containers:
- image: centos/httpd
services:
- name: httpd
type: NodePort
ports:
- port: 8080
jobs:
- containers:
- image: perl
name: mymicroservice1specificjob
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
parallelism: 3
2) a file that defines stuff specific to microservice2
name: mymicroservice2
containers:
- image: centos/httpd
services:
- name: httpd2
type: NodePort
ports:
- port: 8081
jobs:
- containers:
- image: perl
name: mymicroservice2specificjob
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
parallelism: 3
3) a file that defines a job that's either global, shared by both of the above microservices, or unrelated to any of them:
name: globalresources
jobs:
- containers:
- image: perl
name: myglobaljob
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
parallelism: 3
We've had a lot of conversation about this issue, both here and at our Slack https://kedgeproject.slack.com/.
There were 3 options and all of those were "different", there was no right or wrong.
The most pressing argument against Option 1 was that it brings about inconsistency in the spec, in the sense that Kubernetes' Deployment controller is given special treatment compared to other controllers like StatefulSets and Jobs, because a deployment can be defined at the root level and not other controllers, those are supposed to have their own field. Every Kubernetes' controller should be able to define one microservice in its entirety. As for jobs, there are use cases where those could be tied to an application or could be totally different stories in themselves.
As for Option 3, it might not make sense when a job has to be tied with a deployment, since both the definitions have to be def
ined different files. Even though the ---
could be used for such a use case, but it could be an abuse of the same and kedge definition should support logically defining parts of an application that belong together without a separator of any sort.
In the end, we decided to vote, since everyone was very, very divided on which way to go forward with. The vote count was four for Option 3 and three for Option 1.
So, for now, we are going forward with Option 3, and time bounding it for 6 weeks till we get some feedback when we deploy real world applications. If it such happens that the UX does not make sense for Option 3, we will reiterate on this issue.
Closed by #195
We're running into confusion as per: #116 as well as #110 which causes issues such as these to occur:
Which IMO is really confusing.
I propose we make
containers
a subkey. So instead, we have this:Thus, we can also implement #110 (for example), by having separate root keys: