@kwiatekus I agree with you that CBS comes with some benefits like caching and composition of responses, but it also comes with some problems like the mentioned abstractions and code complexity - especially when, for every new query/subscription/mutation, you must write the same code over and over, where the only difference is in the types: GraphQL types or native CR types like spec, etc.
I would not give up GraphQL itself, but as you wrote, we should check whether we need caching - for me, yes, we need it. Using the api-server without caching will sooner or later cause more problems than the current problems with synchronisation in indexers, because that is the biggest problem with indexers, and even the k8s people don't hide it.
Recently I was wondering whether it is possible to write only one generic query for k8s get and list. The problem here is Golang itself, which has no generics, but you can get around it by thinking outside the box. Every implementation of GraphQL under the hood uses the JSON format for retrieving only the data/fields the user wants. We can use that and, on the CBS side, operate not on a generic k8s type like `runtime.Object` but on `unstructured` :) `unstructured` holds the raw JSON of a resource in a struct, and based on it we can reduce writing new queries to 0. But here is a place for discussion: what contract will we accept for those queries (get and list, and in the future mutations and subscriptions)? My proposition:
```graphql
type Query {
  genericGet(schema: String!, name: String!, namespace: String): Resource
  genericList(
    schema: String!
    namespace: String
  ): [Resource!]!
}
```
where `schema` is the type (kind and apiVersion) of the retrieved resource(s), like serviceinstances -> `serviceinstances.service-catalog.k8s.io/v1beta1`, and the `Resource` type is something like this:
```graphql
type Resource {
  apiVersion: String!
  kind: String!
  metadata: ResourceMetadata!
  spec(fields: [ResourceFieldInput!]!, rootField: String = "spec"): JSON!
  subResource(schema: String!, name: String!, namespace: String): Resource
  subResources(
    schema: String!,
    namespace: String,
  ): [Resource!]!
  raw: JSON!
  parent: Resource
}
```
As you can see, we base the queries on the resource schema; under the hood we use the dynamicClient to retrieve the resource(s) with the appropriate type, and then operate on the `unstructured` type.
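To make this concrete, here is a minimal sketch of how a `genericGet` resolver could use the dynamic client. The kubeconfig loading, the resource name, and the hardcoded GVR are assumptions for illustration; a real resolver would parse the GVR from the `schema` argument (e.g. `pods/v1`):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (inside CBS this would be in-cluster config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// A "schema" argument like "pods/v1" would be parsed into this GVR.
	gvr := schema.GroupVersionResource{Group: "", Version: "v1", Resource: "pods"}

	// The result is *unstructured.Unstructured: a thin wrapper around the raw
	// JSON (map[string]interface{}), so no per-resource Go types are needed.
	pod, err := client.Resource(gvr).Namespace("default").
		Get(context.TODO(), "my-pod", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println(pod.GetKind(), pod.GetName())
}
```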
I will create a proposal (if it is not a problem for you) in the coming days, in which I will describe my idea in detail, including the use cases present in the current CBS like nested queries serviceinstances -> servicebindings -> servicebindingusages, filtering, sorting, etc... I also have an idea for better CBS modularization.
At the moment I have a branch where the described idea works for the list and get queries :) If you are interested, I will create a draft PR with it, and you'll be able to test it for yourself.
I missed something...
The `ResourceFieldInput` type has the following format:
```graphql
input ResourceFieldInput {
  key: String
  path: String!
}
```
Then in a query you use it in this way:
```graphql
query {
  genericGet(schema: "pods/v1", name: "XYZ", namespace: "XYZ") {
    metadata {
      name
    }
    spec(fields: [{
      key: "containers",
      path: "spec.containers"
    }])
  }
}
```
and you get only the wanted `spec.containers` data in the output :)
"data": {
"metadata": {
"name": "XYZ"
}
spec: {
"containers": [...] // containers of pod
}
}
@magicmatatjahu If I get it correctly, you suggest adding a GraphQL layer just for caching? Wouldn't it be better to use an existing solution, some kind of proxy which could cache api-server responses?
I'm also not convinced about the caching. Unfortunately, we don't know how much traffic to expect, but I'd assume fewer than ten users at once. Assuming we have a slowly changing environment and browser caching done right, I'd expect the traffic generated by users to be smaller than that generated by our controllers. We need to verify it experimentally somehow, though.
@sjanota I suggest thinking of CBS as a layer over the api-server, yes. By existing solution, do you mean the pure api-server or actually CBS? If CBS, it won't change anything, but if the pure api-server, then you will have a problem handling the buffering logic in a proxy - and indexers are something like a proxy with buffering, and very well tested. Indexers also have a lot of pros, like adding custom indexes for faster filtering, for example, and pagination will be very easy to implement. But if caching is really unnecessary, we can base the solution on regular client-go without an indexer.
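For illustration, a minimal sketch of the custom-index idea with client-go (the `app` label key and the `byApp` index name are assumptions, not CBS code); this is what makes filtered lookups cheap compared to a plain caching proxy:

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := dynamic.NewForConfigOrDie(config)

	factory := dynamicinformer.NewDynamicSharedInformerFactory(client, 10*time.Minute)
	gvr := schema.GroupVersionResource{Version: "v1", Resource: "pods"}
	informer := factory.ForResource(gvr).Informer()

	// Custom index: look up cached objects by an (assumed) label instead of
	// filtering the whole list on every request. Must be added before Start.
	if err := informer.AddIndexers(cache.Indexers{
		"byApp": func(obj interface{}) ([]string, error) {
			u := obj.(*unstructured.Unstructured)
			return []string{u.GetLabels()["app"]}, nil
		},
	}); err != nil {
		panic(err)
	}

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)

	// Cheap lookup from the local cache, no api-server round trip.
	pods, err := informer.GetIndexer().ByIndex("byApp", "my-app")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(pods), "pods with label app=my-app")
}
```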
As I wrote, I wouldn't want to give up GraphQL. One thing that automatically wins over the rest (for me) is that you poll only what you want, which greatly reduces the weight of the response. I can imagine a situation where you create a request for pods in a given namespace where there are 200 of them, you only need their names, and your response suddenly grows to about 1-3 MB.
And I agree with you, we must verify if caching is needed.
I think that we need a layer between the UI and the api-server anyway. The main question is how rich that layer should be: a more transparent layer with added authorization and caching, or a full-blown abstraction. I support the direction with a dynamic GraphQL schema proposed by @magicmatatjahu. Please also consider the dynamic Kubernetes client's capabilities: https://github.com/kubernetes/client-go/tree/master/examples/dynamic-create-update-delete-deployment - maybe it would help with resolving the Golang challenges. In my opinion we just need to make adding and modifying resources in CBS simpler. If that is too hard with the existing solution, maybe we can consider something even more dynamic. We get JSON from the api-server, unmarshal it to Golang structures, and then serialize it to the GraphQL (JSON) response. If you ask me to think outside the box, I would do it in one line like this:
```js
(req, res) => request.get('.../path/to/my/resource').then(r => res.json(r.body.map(graphqlMapper)))
```
For me Console Backend Service brings the following advantages:
Kubernetes Dashboard also has custom back-end service to have more complex logic on back-end side.
Using K8s API server directly would mean that:
So for me it's all about complexity. The more logic on the back-end side, the better. Even if we started again from scratch using the API server directly, we would end up with a custom back-end service anyway.
> it slows down UI development

I would rather ask what's the reason for the slowdown in development. Is it about:

- project structure (is it hard to maintain? is there too much boilerplate?)
- Go code (is the code too complex? are contributors familiar and confident with writing Go?)
- something else?

Then we can identify the real issue.
What I am aware of in CBS, from when I was the code owner of the component, is that:
We were thinking about code generation, which is totally possible for CRUD operations on K8s resources, but unfortunately we didn't have time for it. (BTW I don't really like the idea of the generic GraphQL API that @magicmatatjahu suggested.) Sorry, @magicmatatjahu 😞
CBS puts an abstraction layer on the Kyma API. It hides the entities' YAML structures from the user, making it hard for the user to learn how to automate the configuration of Kyma runtimes.
We have learned that it is not the user's intention to create their workloads in Kyma through the convenient UI. The primary purpose of the UI is:

- inspection of the workloads
- enabling new users to try out and understand Kyma (enabling users to learn how to automate runtime configurations using the Kyma API directly)

Therefore, we want to challenge the initial assumptions for the CBS and evaluate how much of the abstraction we really need.
I think this is something unrelated to the back-end service for the UI. It's about the UI itself: what the user sees, not what's happening under the hood. Maybe we should reduce the abstraction in the UI and display data as it is in the K8s API server. But that's not about CBS, it's about the UI.
@pkosiec One comment on your summary. Please note that we should not have any complex logic in CBS, because the Kyma principle is UI and CLI parity. So one action in the UI should be one kubectl command. One exception is an aggregated view, like some dashboard with namespace health status, where you query multiple resources to give the user an overview. If you need to perform a complex action, like exposing your lambda with the Istio ingress gateway, you should have a controller doing that in response to a simple command (in this case `kubectl create -f myapirule.yaml`).
The best example of how UI and CLI parity works in practice is console.cloud.google.com. When you create any resource from the UI, you always have an option to copy to the clipboard the gcloud command doing the same thing.
@pbochynski Under the hood, in my POC, I use the dynamicClient for fetching a resource by the schema given by the user, like `pods/v1`, and I marshal to JSON in the same way as you described in your suggestion: fetch an array or a single resource -> marshal to pure JSON -> filter the fields of the resource spec by JSONPath (the same way as in kubectl, ref: https://kubernetes.io/docs/reference/kubectl/jsonpath/) - and it's only some 500-600 lines :)
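A minimal sketch of that filtering step, assuming the kubectl-style JSONPath package from client-go (`k8s.io/client-go/util/jsonpath`); the pod map stands in for `unstructured.Unstructured.Object`, and the expression corresponds to a `{key: "containers", path: "spec.containers"}` field input:

```go
package main

import (
	"bytes"
	"fmt"

	"k8s.io/client-go/util/jsonpath"
)

func main() {
	// Stand-in for the Object field of an unstructured Pod.
	pod := map[string]interface{}{
		"spec": map[string]interface{}{
			"containers": []interface{}{
				map[string]interface{}{"name": "app", "image": "nginx"},
			},
		},
	}

	jp := jsonpath.New("containers")
	// The same expression syntax that kubectl -o jsonpath uses.
	if err := jp.Parse("{.spec.containers}"); err != nil {
		panic(err)
	}

	var buf bytes.Buffer
	if err := jp.Execute(&buf, pod); err != nil {
		panic(err)
	}
	fmt.Println(buf.String()) // only the requested part of the spec
}
```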
I will try to make a PR with this POC today, and you will be able to try the solution in action yourself :)
@pkosiec For 3 days I have been thinking about generating the GraphQL schema for the appropriate resources like Pods, Deployments, etc. Writing them manually would be painful -> please see the JSON schema of a Pod or a Deployment, for example... So here I agree with you about generation. It is one solution, but there are a lot of problems to resolve (and that is why I used the generic `spec(fields: [])` in my proposition):
... (the `json` tag). As far as I know, gqlgen (the tool for generating the schema in CBS) at the moment doesn't support generating a schema from a Golang type at runtime. So a contribution there, or generating Golang types to a JSON schema and then to a GraphQL schema, is the only way. NOTE: Some CRDs don't have a JSON schema, like Knative Service (see the issue https://github.com/knative/serving/issues/912), and for that case I mentioned converting Golang types.
We can accept having a generated GraphQL schema and a `models_gen` regenerated by gqlgen, but then we must write the code handling the CRUD operations -> for a more generic way, we would need another generator for that, which doesn't exist in open source :) That means a new tool to build and maintain. Also, mapping one structure to another that differs in nothing but the name - for example `v1.PodSpec` -> `graphql.PodSpec` - is a small problem, but still a problem. It can be done generically, but probably only through some generator.
For every new resource you want to add to our CBS, you must every time go through steps 1 and 2, and it dramatically grows our GraphQL schema and, of course, the code generated by the tool described in point 2 (and by gqlgen)...
In a perfect world, our schema would look like this:
```graphql
type Resource {
  apiVersion: String!
  kind: String!
  metadata: Metadata!
}

type Pod implements Resource {
  apiVersion: String!
  kind: String!
  metadata: Metadata!
  spec: PodSpec!
  status: PodStatus!
}

type Deployment implements Resource {
  apiVersion: String!
  kind: String!
  metadata: Metadata!
  spec: DeploymentSpec!
  status: DeploymentStatus!
}

union Result = Pod | Deployment

type Query {
  resource(name: String!, namespace: String, options: GetOptions): Result
}
```
```graphql
# query on the client side
query {
  resource(name: "XYZ", namespace: "XYZ") {
    apiVersion
    kind
    metadata {
      name
    }
    ... on Pod {
      spec {
        containers {...}
      }
    }
    ... on Deployment {
      spec {
        template {...}
      }
    }
  }
}
```
I know that my solution doesn't follow the GraphQL specification and destroys the GraphQL types in the `spec` field, but for internal use in Kyma, is that needed? Generating GraphQL types for things like Kubernetes resources (which are generic) makes no sense to me at the moment; it is art for art's sake.

We only get the type hints in a dedicated tool like GraphQL Playground. On the client side you must write the query manually without hints -> that's what the Playground is for. But again: is it needed for the internal use cases in Kyma? I don't think so...
What I mainly want to achieve with my solution is that the developer has nothing to do - no generating new code, etc. - only adding information (an env variable or a ConfigMap entry) so that CBS creates a new indexer for the configured k8s types (schemas) at startup, and that's all.
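A hypothetical sketch of that startup wiring: the list of schemas comes from an env variable (it could equally come from a ConfigMap), and one dynamic informer is created per configured schema. The `RESOURCE_SCHEMAS` name and the simplified parsing are illustrative only, not the POC's actual code:

```go
package main

import (
	"os"
	"strings"
	"time"

	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/clientcmd"
)

// parseSchema turns "serviceinstances.service-catalog.k8s.io/v1beta1" or
// "pods/v1" into a GroupVersionResource (simplified: no validation).
func parseSchema(s string) schema.GroupVersionResource {
	slash := strings.LastIndex(s, "/")
	resourceAndGroup, version := s[:slash], s[slash+1:]
	dot := strings.Index(resourceAndGroup, ".")
	if dot < 0 {
		return schema.GroupVersionResource{Version: version, Resource: resourceAndGroup}
	}
	return schema.GroupVersionResource{
		Group:    resourceAndGroup[dot+1:],
		Version:  version,
		Resource: resourceAndGroup[:dot],
	}
}

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := dynamic.NewForConfigOrDie(config)
	factory := dynamicinformer.NewDynamicSharedInformerFactory(client, 10*time.Minute)

	// e.g. RESOURCE_SCHEMAS="pods/v1,serviceinstances.service-catalog.k8s.io/v1beta1"
	for _, s := range strings.Split(os.Getenv("RESOURCE_SCHEMAS"), ",") {
		if s == "" {
			continue
		}
		factory.ForResource(parseSchema(s)).Informer() // registers one informer per schema
	}

	stop := make(chan struct{})
	factory.Start(stop) // one cache per configured schema, no new Go code per type
	factory.WaitForCacheSync(stop)
	select {} // the GraphQL server would be started here
}
```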
As a long-term plan we can try a solution based on pure GraphQL types, but at the moment, to speed up writing the UI without an extra dozen hours of writing CBS code for each new domain, we can try my solution and see: does it work? We can select some domain in the UI and see whether it speeds up our work or not, and what problems are associated with it. I think I'll create a proposal next week 😄
@pbochynski Alright, but the complex logic I wrote about doesn't necessarily have to be connected to a user action - it could be related to displaying some data as well.
I think first we all need to agree whether we want to have a back-end layer between the K8s API and the UI or not. Then we can proceed to choosing how the back-end API server should look.
I can see that we both agree about having an API for the UI, but let me explain for others what benefits a dedicated API for the UI brings.
Let's take as an example the simple ClusterServiceClasses view of the Service Catalog. There is not much complexity, but the difference should still be visible. In the top right corner we have the Service Instances count. How would the view be implemented in both cases? Let's assume we have 4 ClusterServiceClasses here.
Without CBS:
It would be good to remember to retry failed calls.
The biggest downside I see is that it's all done on UI side.
With CBS:
You don't have to retry, as we have a cache with up-to-date data; its resync is automatic thanks to informers.
Main differences between the two approaches:
Regarding @magicmatatjahu's proposal: the idea of GraphQL is not to have a dynamic API, but to have a well-defined, typed contract between the UI and the API. That's why I would suggest code generation with a dedicated GraphQL API per Kubernetes resource. I mean a custom code-generation tool based on Go templates, driven by CRDs, which would generate the GraphQL schema, resolver, service, etc. The service would be based on dynamic informers, so it shouldn't be a problem to generate all the code, or at least the bigger part of it, given just the CRD.
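A toy sketch of that code-generation direction: a Go template rendering one GraphQL type per resource. The inputs here (kind, field list) are hypothetical; a real tool would derive them from the CRD's OpenAPI schema:

```go
package main

import (
	"os"
	"text/template"
)

const schemaTmpl = `type {{.Kind}} {
  apiVersion: String!
  kind: String!
  metadata: Metadata!
{{- range .Fields}}
  {{.Name}}: {{.Type}}
{{- end}}
}
`

type field struct{ Name, Type string }

func main() {
	t := template.Must(template.New("schema").Parse(schemaTmpl))
	// In a real generator this data would be parsed from the CRD.
	err := t.Execute(os.Stdout, struct {
		Kind   string
		Fields []field
	}{
		Kind: "Pod",
		Fields: []field{
			{Name: "spec", Type: "PodSpec!"},
			{Name: "status", Type: "PodStatus!"},
		},
	})
	if err != nil {
		panic(err)
	}
}
```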
A few cents from me:
By existing solution, I meant some open-source caching HTTP proxy which we don't have to develop and maintain.

One thing I think wasn't mentioned explicitly: we are talking about the future. Like Kyma 2.0 kind of future. So things will get rewritten. A lot. I wouldn't limit our ideas just because they are far from where we are now.
@kwiatekus correct me if I misunderstood something.
Let me summarize a couple of points discovered up to this moment:
`kubectl apply -f`
The current implementation is, gently said, not efficient in point 4 and questionable in point 2.
@sjanota I will answer only this sentence:

> I'm just not sure if we can handle all the stuff in a generic way, like sub-resources, authorization etc.
I don't want to say that my solution is the best, etc., but I have also thought about this problem. For example: you want to make a query listing serviceInstances in some namespace, with information about the serviceBindings created for the appropriate serviceInstance. In the current CBS you can do that, but if you want to list some other resources like pods, secrets, etc. (these are only examples) based on a serviceInstance, you cannot, and you must write a new subquery in Go - that is the only possibility right now.
In my solution you have:
```graphql
type Resource {
  ...
  subResources(
    schema: String!,
    namespace: String,
  ): [Resource!]!
  ...
}
```
and then on the client side you make a query like:
```graphql
query {
  genericList(schema: "serviceinstances.service-catalog.k8s.io/v1beta1", namespace: "XYZ") {
    metadata {
      name
    }
    subResources(schema: "pods/v1", namespace: "$parent.metadata.namespace") {
      metadata {
        name
      }
    }
  }
}
```
Using the `$parent` JSON element you can easily find the wanted resource using data from the parent (here, the serviceInstance). The problem is that, as a developer, you must find some correlations (labels, annotations) between these two types, serviceInstance and pod.
And authorization is also not hard to implement for subresources.
Nice try. We could help with finding subresources by making it more standardized throughout Kyma resources. We probably should utilize labels as they are the means to filter in k8s.
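To illustrate the label idea: if Kyma resources carried a standardized label pointing at their parent, a `subResources` resolver would reduce to one filtered list call. This is a sketch under that assumption; the `kyma-project.io/instance` label key is made up for illustration:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := dynamic.NewForConfigOrDie(config)

	gvr := schema.GroupVersionResource{Version: "v1", Resource: "pods"}

	// One list call, filtered server-side (or against the indexer cache) by the
	// standardized parent label, instead of hand-written per-type correlations.
	pods, err := client.Resource(gvr).Namespace("default").List(context.TODO(),
		metav1.ListOptions{LabelSelector: "kyma-project.io/instance=my-instance"})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		fmt.Println(p.GetName())
	}
}
```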
Another idea worth at least considering: splitting CBS into multiple GQL servers for better modularity. So, for example, the server for applications would come with application-connector (or another application-related module). API rules would be handled by api-gateway. Like in any other microservice infrastructure, a single service exposes an API to query and manipulate its model.
Such an approach should ideally make Kyma modules self-contained, which means easier extensibility and modularity. Here I'd also probably go for a full-blown schema as @pkosiec described it. Why? We would keep the whole API definition in separate, independent chunks, which makes them easier to write and maintain.
Con: it makes it possibly harder to model relations between resources from different modules. On the other hand, we have such relations and need to handle them anyway. Such architecture just makes it explicit.
A couple of open questions to this idea:
Does it even make sense?
It mimics our current micro frontends, which are already spread across Kyma modules. This would also mean a mindset shift to e2e modules serving a specific domain instead of being frontend-/backend-focused.
@michal-hudy, you are one of the supporters, maybe you can extend/correct what I said.
@sjanota This idea is not something new, as I also included it in my presentation "Building GraphQL API for Kubernetes resources" from mid-2019. The presentation was basically a case study of Console Backend Service with an idea for future development. I encourage you to take a look at the slides, at least, to see and understand the architectural decisions we made while developing CBS.
https://docs.google.com/presentation/d/1vXdfxkvonEmgp2CGMi-zYqjZbiiAssFbdCkNk8GQhwk/edit?usp=sharing
However, keep in mind that multiple microservices will add another level of complexity, even if they ease modularity. I would go back to the question I asked:
> I would rather ask what's the reason for the slowdown in development. Is it about:
>
> - project structure (is it hard to maintain? is there too much boilerplate?)
> - Go code (is the code too complex? are contributors familiar and confident with writing Go?)
> - something else?
>
> Then we can identify the real issue.
Basically: what prevents us from developing CBS further? What do we want to change, and why?
I agree with @pkosiec. GraphQL Federation can be useful for removing `BackendModules` from CBS, but it definitely will not help with generating types and with complexity. Also, more resources will be used, because there will be more pods, the cache will not be shared, etc... In such a case @magicmatatjahu's approach is much better, because CBS is used only for our UI; it is not designed for building user solutions on top of it, so we don't have to use all the features of GraphQL - it must solve only our problems.
Ok, let's have a small mental experiment. We need this additional layer only for UIs, right? Cool. We don't expect anyone else to use it. It can be tailored to UI needs. Why GraphQL then? Why can't we have a simple REST service with responses tailored to our needs?
I don't think it changes much, but we haven't challenged the GQL assumption yet.
@pkosiec My reasons why I picked this battle:
@michal-hudy We may assume it is only for UIs, but what about MFs delivered by users? Or in addons? Do we know any use case where an extension may consume this API?
@a-thaler @pbochynski @valentinvieriu maybe you know something?
@sjanota Yes. In general this debate is about the future component serving as the UI proxy (ui-backend). Still, there might be some action items proposed out of it for the current solution (i.e. more advanced code generation).
@sjanota
> Ok, let's have a small mental experiment. We need this additional layer only for UIs, right? Cool. We don't expect anyone else to use it. It can be tailored to UI needs. Why GraphQL then? Why can't we have a simple REST service with responses tailored to our needs?
For a very simple reason: in GraphQL, if you want to fetch only the names of resources like pods, you fetch only the names. Those operations (selecting what you want to fetch) are performed under the hood by the GraphQL engine on the server. In a REST solution, if you want to fetch only some parts of the pods, you must filter the result manually in your code logic, based on query params passed in the request.
More formally:
So you must write your own filter engine in a REST solution; a toy illustration follows below. I know that GraphQL has some issues, like caching queries over the HTTP protocol (GraphQL usually works over POST), but I think it has more benefits than cons.
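Here is a small taste of such a hand-rolled filter engine, assuming a hypothetical `?fields=metadata.name,spec.nodeName` query param on a REST endpoint; this is exactly the kind of code GraphQL's field selection gives you for free:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// pick keeps only the requested dotted paths of a JSON object.
func pick(obj map[string]interface{}, paths []string) map[string]interface{} {
	out := map[string]interface{}{}
	for _, p := range paths {
		cur, dst := obj, out
		parts := strings.Split(p, ".")
		for i, part := range parts {
			v, ok := cur[part]
			if !ok {
				break
			}
			if i == len(parts)-1 {
				dst[part] = v // leaf: copy the value
				break
			}
			next, ok := v.(map[string]interface{})
			if !ok {
				break
			}
			child, ok := dst[part].(map[string]interface{})
			if !ok {
				child = map[string]interface{}{}
				dst[part] = child
			}
			cur, dst = next, child // descend one level on both sides
		}
	}
	return out
}

func main() {
	var pod map[string]interface{}
	// Error handling omitted for brevity.
	json.Unmarshal([]byte(`{"metadata":{"name":"my-pod","labels":{"app":"x"}},"spec":{"nodeName":"n1"}}`), &pod)

	// Equivalent of ?fields=metadata.name,spec.nodeName on a REST endpoint.
	filtered := pick(pod, []string{"metadata.name", "spec.nodeName"})
	b, _ := json.Marshal(filtered)
	fmt.Println(string(b)) // {"metadata":{"name":"my-pod"},"spec":{"nodeName":"n1"}}
}
```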
As soon as any user's MF is allowed to consume it, the API must stay compatible and needs to be well documented and explorable, and so on. I think we should not add that layer of complexity; the Kyma API is the CRDs, not the GraphQL API.
@magicmatatjahu or you just write a query that returns only names ;) But I agree that GraphQL is a comparable implementation effort with bigger benefits. I just want to make sure that we don't have any unchallenged assumptions.
Unfortunately, separating CBS into many smaller microservices may bring us even more problems than we currently have. Personally I like the phrase that "microservices solve organizational problems and cause technical ones", and I think that while CBS just keeps on growing, there is some kind of limit to it; it'll never be "enterprise-level" big. Just thinking about the proper testing architecture that we would need to set up to test each and every microservice makes me scared 😨.
@magicmatatjahu showed us this quick POC during one of our meetings, and I kind of like the idea that we have very little code to maintain, but I fear that we may find ourselves in a situation where we need to do something and this infrastructure just doesn't let us do it in an easy way. But hopefully that would not be the case, as it should be quite straightforward, as @pbochynski suggested (kubectl and UI parity).
As for the goodies that GraphQL brings us - smaller payloads, cache, an easily explorable API from a specialized UI, no complex logic in the UI, etc. - those requirements need to be reevaluated, because many of them seem to be exaggerated in favor of GraphQL. The complex logic in the UI is often just one reduce call on your data structure anyway, and partially losing those advantages in favor of less complex logic seems like a good idea to me (so, good idea @magicmatatjahu).
One more thing - I would propose exploring code generation even further, as we need strong types in a project such as CBS.
@sjanota
> Ok, let's have a small mental experiment. We need this additional layer only for UIs, right? Cool. We don't expect anyone else to use it. It can be tailored to UI needs. Why GraphQL then? Why can't we have a simple REST service with responses tailored to our needs?
>
> I don't think it changes much, but we haven't challenged the GQL assumption yet.
Please see the presentation I attached. As I wrote, it's all explained there - all bigger architectural decisions in CBS.
Guys, I think this discussion is not going anywhere. I suggest taking it offline and:
What do you think?
Requirements for the UI backend component are defined by expectations regarding how the UI should evolve. Right now those expectations are a bit blurry, yet the key aspects are simple:
I will keep working on those
Thanks everybody for contributing.
I will continue working on the requirements for the UI backend (implied from the UI requirements).
Things that need to be improved:
Description
The Console is using the console backend service (CBS) to communicate with Kyma resources (the k8s API server). Among CBS's biggest benefits:
Unfortunately, those benefits don't come for free.
Reasons
We have learned that it is not the user's intention to create their workloads in Kyma through the convenient UI. The primary purpose of the UI is:

- inspection of the workloads
- enabling new users to try out and understand Kyma (enabling users to learn how to automate runtime configurations using the Kyma API directly)
Taking into account what we have learned, we want to challenge the initial assumptions for the CBS and evaluate how much of: