elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.79k stars 8.19k forks source link

[Infra/Logs UI] [R&D] Should we replace GraphQL? #36674

Closed jasonrhodes closed 5 years ago

jasonrhodes commented 5 years ago

We've discussed whether having GraphQL as part of the infrastructure and logs UI data layer is a good thing or whether we should remove it in favor of a more simple REST (or other?) interface. We should outline the basic pros and cons. I think it would be good to have the folks who have been working with this code (@simianhacker, @weltenwort, @skh, and @Kerry350) document the following answers:

"Keep GraphQL" side:

"Remove GraphQL" side:

Time-box: 2-3 hours

elasticmachine commented 5 years ago

Pinging @elastic/infrastructure-ui

weltenwort commented 5 years ago

Thanks for remembering to bring this up.

Current Usage

We're using GraphQL for all http API endpoints in the infra plugin except the rather new metrics explorer. The usage looks like this:

(Warning, opinions and speculation from here on)

Advantages

Disadvantages

Alternative

Considering the current usage outlined above, I think we could preserve the advantages and reduce disadvantages by using basic HTTP endpoints with JSON serialization in combination with https://github.com/gcanti/io-ts (already in Kibana's dependency list) for validation and type safety. I don't think the ReST paradigm matches our API very well (and I've rarely seen it implemented correctly). Using Joi for validation also is not ideal IMHO, because it (1) doesn't work in the browser and (2) only validates the request, but not the response and doesn't integrate with TypeScript. Instead, if we replaced GraphQL with HTTP routes, I would propose something like the following:

This would preserve the following characteristics on both client and server

while improving on problems we currently have with GraphQL

Thanks for reading this wall of text. I'm looking forward to hearing opinions and alternative proposals.

Kerry350 commented 5 years ago

"Keep GraphQL" side:

@weltenwort's answer to this was very exhaustive. All I'll add is there's now one additional non-GraphQL usage in the "IP to host" endpoint (this was only merged yesterday).

"Remove GraphQL" side:

I actually really like the method proposed by @weltenwort (for context we spoke a little about it at GAH too, and Felix explained the issues with Joi in the browser).

I don't think the ReST paradigm matches our API very well (and I've rarely seen it implemented correctly).

I do strongly agree with this. Now, I fall in to the "It's RESTful" camp when it's really not just as much as anyone else. But, if we're completely redefining how we do things here, we might as well be accurate 😄 I honestly can't see that for every resource we provide we'll do things like provide hypermedia links to transition that resource from one state to another (pagination), or to navigate to linked resources.

If we go endpoint by endpoint I don't think there's too much risk, at least no more than the Redux -> Hooks refactors that have been taking place.

weltenwort commented 5 years ago

We'd lose the ability to "play" with / test the API with GraphiQL.

Good point! This really helped me during development and testing and it might be difficult to replicate that without GraphQL. There are similar tools for HTTP APIs, but they also require some kind of machine-readable API specification or documentation to provide a comparable level of convenience.

If we go endpoint by endpoint I don't think there's too much risk, at least no more than the Redux -> Hooks refactors that have been taking place.

Agreed. The complexity of the migration might be a bit higher if we want to slim down the abstraction layers at the same time. But nothing that can't be handled with a bit of API testing discipline.

simianhacker commented 5 years ago

We'd lose the ability to "play" with / test the API with GraphiQL.

Good riddance! The GraphiQL enables us to be lazy. We shouldn't be playing with the API for testing/development, we should be writing API integration tests to trigger the endpoints for development.

I don't think the ReST paradigm matches our API very well (and I've rarely seen it implemented correctly).

Let's agree to stop calling it REST and just call it an HTTP API. Getting bogged down in the details of REST will end up paralyzing us into an endless analysis of how REST like we are, total waste of time.

Cognitive overhead: there is more to reason about in implementing something in GraphQL (schema, resolvers etc) than a simple endpoint function

This is the primary reason we are making this change. Since we are barely using the "killer" features of GraphQL it's pretty hard to justify asking new team members to learn yet another library (along with all the idiosyncrasies of Kibana). Everyone should know how HTTP API's work without much explanation.

Another related issue, the types that GraphQL script generates are practically unreadable and increases the complexity.

import { SourceQuery } from '../../../graphql/types';
export const useMetricsExplorerState = (
  source: SourceQuery.Query['source']['configuration'], // <-- Yuk!
  derivedIndexPattern: StaticIndexPattern
) => {
  ...
}

Furthermore, the type generation is not "automatic", we have to remember to re-generate the types when the schema changes, this can be frustrating to debug when things get out of sync or the output from the underlying library changes. Yet another moving piece we have to manage and maintain.

skh commented 5 years ago

Pro GraphQL

Contra GraphQL

Replacement

jasonrhodes commented 5 years ago

This is fantastic feedback, @infra-logs-ui team! I like everything that's laid out here, and in the end agree with the approach spelled out by @skh that we should try to move to a very simple HTTP API approach and then improve it as we go, although if there are parts of the implementation that @weltenwort outlined above that we think are simple enough to do, we can consider them. I confess I don't completely follow it so we should zoom about it when everyone is back from holiday in a week and begin to make a more structured plan for this work. I'll put something on the calendar.

Thanks!

weltenwort commented 5 years ago

An example of what it would look like with client-to-server typing: weltenwort/kibana#2

roncohen commented 5 years ago

what's the the runtime overhead of io-ts validation? I understood it would run through every JSON response to check that everything is the right type. For large responses (1000+ numbers and dates, for example), I'd assume this could add significant overhead if we're not careful.

weltenwort commented 5 years ago

I'm not aware of any benchmarks, but we could perform some. But since not performing validation is not really an alternative, we should compare against other validation libraries.

roncohen commented 5 years ago

But since not performing validation is not really an alternative, we should compare against other validation libraries.

Is it common to do run time type checking of APIs in production? As an aside, React propTypes are only validated in development. If there's a significant overhead from io-ts, could we use it in development mode only?

weltenwort commented 5 years ago

I consider validating API payloads a security best practice in production. Not blindly trusting user input (which anything from the network essentially is) is a basic principle IMHO. The apollo server and client did that for us so far, but when we move to a custom HTTP API we will have to manually include that validation step. I proposed io-ts for that because it not only performs the validation but also ensures the compiler is aware of the value types, thereby providing a single source of truth (like the GraphQL schema was before).

The analogy to React prop types is not valid, I think. React component props don't really represent a boundary of the system to an untrusted environment in the same way that a network API does.

roncohen commented 5 years ago

to be clear, i think it's great if we can get type checked API responses, but we need to be aware of the overhead

jfsiii commented 5 years ago

An example of what it would look like with client-to-server typing: weltenwort#2

@weltenwort I just saw this and it reminds me of the explorations I did in https://github.com/elastic/kibana/pull/40041 around using JSON Schema to describe the types. As the io-ts author outlines here, depending on one's preferences for which format should be the "source of truth", it's possible to either

I'd love to talk more about this when you get a moment. I think this could unlock some very useful features.

weltenwort commented 5 years ago

Absolutely, I'd be happy to discuss it when you have the time.

Thanks for linking the io-ts issue. From the experience we gathered with GraphQL I would prefer the variant where we define the io-ts type at runtime as the source of trutz. The separate compilation step is seen as one of the biggest pains in our current graphql setup, so a pure runtime setup would be preferable.

jfsiii commented 5 years ago

The separate compilation step is seen as one of the biggest pains in our current graphql setup, so a pure runtime setup would be preferable.

Totally agree about avoiding the compilation step. I was hoping for something like Enjoi which I used in that PR to convert from JSON Schema to Joi at runtime without a compilation step

https://github.com/elastic/kibana/blob/5301b2c66f066a5dbc6b7a445951702970e995c7/x-pack/legacy/plugins/integrations_manager/server/routes.ts#L18

https://github.com/elastic/kibana/blob/5301b2c66f066a5dbc6b7a445951702970e995c7/x-pack/legacy/plugins/integrations_manager/server/routes.ts#L34-L43

But I see that both io-ts-codegen and json-schema-to-typescript require writing out to disk. Which make sense I guess.

I was hoping to have a spec expressed in JSON Schema in implemented in other places/languages (Principle of Least Power, etc), but the problems caused by having them out of sync (loss of trust, false positives & negatives) and pain of keeping them in sync seem to shift to authoring in TS and exporting to JSON Schema as needed.

One item from that issue about starting with a runtime type:

However conversions might become tricky if there are recursive runtime types involved.

Can you think of any places where we have recursive runtime types?

weltenwort commented 5 years ago

Can you think of any places where we have recursive runtime types?

Not really. And we probably want to use a safe JSON serializer (with stable key order and cycle detection) in any case when doing this kind of conversion. Do you intend to serve the JSON schema on a well-know route or do you want to write it out to a file?

jfsiii commented 5 years ago

I don't even know if we'll need it now. My initial idea was have a spec in JSON Schema so it could be a) converted to Joi for validation, TS for types, etc and b) included in an OpenAPI definition

We've gone over how we'll try to address (a) in the last few comments and I just saw that there are a few references, especially https://github.com/elastic/kibana/issues/40623, for incorporating OpenAPI definitions.

I think I'll start with having Integrations Manager use io-ts and see where that leads.

weltenwort commented 5 years ago

Yes, that makes the most sense to me as well. We should be able to generate everything both at runtime and any other point in time from the runtime types.

elasticmachine commented 5 years ago

Pinging @elastic/infra-logs-ui