jaegertracing / jaeger

CNCF Jaeger, a Distributed Tracing Platform
https://www.jaegertracing.io/
Apache License 2.0
20.24k stars 2.41k forks source link

[Experimental] Provide GraphQL query service #169

Open saminzadeh opened 7 years ago

saminzadeh commented 7 years ago

Instead of an arbitrary "RESTful" HTTP API service, I think a better solution would be to implement the query service over GraphQL.

Here is my sample implementation of what the service would like:

https://launchpad.graphql.com/wxn5zk8mz HTTP Endpoint: https://wxn5zk8mz.lp.gql.zone/graphql

Some of the benefits:

This will help solve the following issues as well: https://github.com/uber/jaeger/issues/158 https://github.com/uber/jaeger/issues/123

yurishkuro commented 5 years ago

@saminzadeh any thoughts about https://rejoiner.io/ ?

saminzadeh commented 5 years ago

@saminzadeh any thoughts about https://rejoiner.io/ ?

@yurishkuro I'm aware but haven't had time to research it yet. Looks very promising and worth investigating

rubenvp8510 commented 5 years ago

@yurishkuro Wondering if is still plans to do this, I would like to contribute and work on this one.

yurishkuro commented 5 years ago

We don't have active plans to move in this direction. I think we need better justification of the costs/benefits before we can undertake this work. Also, we now support gRPC API for querying, so I would like to deprecate the old REST API, but we have not decided what should replace it.

cc @tiffon @everett980 - do you have any thoughts on this?

rubenvp8510 commented 5 years ago

Well, I was thinking this could be used by the UI, not sure what is the status of the gRPC in the browsers.

yurishkuro commented 5 years ago

@rubenvp8510 sure, but we need to get some very useful new capabilities that we cannot get otherwise, if we decide to undertake this significant renovation.

jpkrohling commented 5 years ago

but we need to get some very useful new capabilities that we cannot get otherwise

Being able to retrieve only a limited set of fields (only operation names, for instance) is already a new useful capability, IMO.

pavolloffay commented 3 years ago

We (Red Hat) would like to move this forward and introduce GraphQL API.

Let me describe the use case: There will be several projects that will use Jaeger query API (Kiali(already uses Jaeger API), Observatorium/OpenShift developer console).

The model served by the new API should be OTLP traces.

Once the API is stable the current query REST API should be deprecated/removed.

Why GraphQL?

The GraphQL clients fully control query (query and what data is being fetched) and it's easy to implement a gateway that composes multiple GrapQL schemas. Similar to REST/OpenAPI it provides out-of-the-box documentation of the exposed API. One downside of GraphQL vs gRPC is generating clients, this seems to be harder and less supported in GraphQL for some languages (e.g. go). This might not be an issue because the clients of query API will be UI or proxy. Also, this will improve over time as the technology matures.

Which libraries should we use?

joe-elliott commented 3 years ago

Great to see some progress on this. We have noticed in particular that sometimes very large traces retrieved in the search pane can slow it down considerably. Only one question/thought:

Once the API is stable the current query REST API should be deprecated/removed.

I"m not sure I agree with this. Restful APIs are extremely easy and predictable to interact with with simple tools like curl. I only know the basics of graphql, but it seems like it will up the complexity a bit?

Do we think it would be burdensome to maintain both? It would be a shame to lose curl http://blerg/api/traces/<traceid>.

jkowall commented 3 years ago

Good idea, I never saw this issue before! 💘

Lots of upside, but the downside aside from what has been mentioned is that if you are transferring a lot of data gRPC would be much more efficient in terms of processing and network transfer not to mention go usage. I think the self-documenting feature of GraphQL is great and improves upon REST, making it a bit easier to integrate. Also, it would be great to use schemas like Otel in the API data model possibly to make it even more compatible, just an idea.

Dropping the existing API might be a good idea in a few years, but deprecation takes time to make sure people have moved over. I think folks have been using it and like Rest, GraphQL is still early IMO.

yurishkuro commented 3 years ago

@jkowall I haven't tried it myself, but https://rejoiner.io/ mentioned in earlier comments claims to serve GraphQL over gRPC as well.

I agree that deprecating the existing REST API is not necessary, it's a very small surface to maintain. If we change anything, then I think we should think about which data model the GraphQL (and REST, potentially) would serve, because now in REST we serve a bespoke JSON model without a publicly defined schema. My preference would be to invest into serving an OTEL-compatible data model, which will future-proof the pipeline. Once such GraphQL API is available we can refactor Jaeger UI to use it instead of the existing bespoke JSON model.

pavolloffay commented 3 years ago

Thanks for the feedback. It seems there is some consensus:

As the next step, I will propose schema and start working on the implementation.

pavolloffay commented 3 years ago

I have submitted POC PR https://github.com/jaegertracing/jaeger/pull/3051.

Using OTLP Trace JSON with GraphQL is problematic. This is because the OTLP Trace JSON does not use standard JSON annotations but instead it uses protobuf annotation to serialize e.g.:

ResourceSpans []*v1.ResourceSpans `protobuf:"bytes,1,rep,name=resource_spans,json=resourceSpans,proto3" json:"resource_spans,omitempty"`

This can be solved by using a cusom marshaller for the high-level object e.g. ExportTraceServiceRequest. However then selecting object fields in the GraphQL query does not work OOTB (a workaround is to manuallu set object fields to null/empty based on the query).

Example query

query findTraces {
  traces(service: "payments", startMicros: 10, endMicros:150) {
    name
    traceId
  }
}

Selecting response structure is a major GraohQL advantage, therefore I don't think we should go forward with GraphQL if we want to use OTLP Trace JSON. Another downside is the structure of the response (see below). The response is always wrapped under data.<query name> - which means that the response data cannot be directly pushed back to the collector.

{
  "data": {
    "<queryName>/getTrace": {
        "resourceSpans": {}
    }
}

Based on the above I propose to either use GRPC with grpc-gateway or to define new REST API with OTLP Trace JSON (https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md#otlphttp).

yurishkuro commented 3 years ago

I am not sure I follow your arguments against graphql. The annotations is a technical detail that we can find a solution for, in the worst case by defining our own types with needed annotations which serialize to the same JSON as otel proto (which we'd need to do anyway if we go with custom rest api).

Submitting graphql output back to collector isn't the goal since the output intentionally can be a sparse trace. We can always have a regular endpoint like we have now for retrieving full trace.

pavolloffay commented 3 years ago

I am not sure I follow your arguments against graphql. The annotations is a technical detail that we can find a solution for, in the worst case by defining our own types with needed annotations which serialize to the same JSON as otel proto (which we'd need to do anyway if we go with custom rest api).

Defining our own type will again unnecessarily increase maintenance. Having a duplicated model just to use the correct JSON annotations does not feel right. Also if we switch internally to OTLP it will need an additional model translation. I will have a look of the proto compiler can be configured to use the same JSON annotations as proto uses.

(which we'd need to do anyway if we go with custom rest api).

I am not 100% sure, in the custom REST API we should control serialization, though I am not sure if spec generation would take into account protobuf annotations, probably it does since the grpc-gateway should be able to generate the spec as well.

I would avoid custom REST API if we can simply use grpc-gateway. The performance issue that you mentioned does not seem like a blocker - especially for the query API.