openzipkin / brave

Java distributed tracing implementation compatible with Zipkin backend services.
Apache License 2.0
2.36k stars 713 forks source link

Elasticsearch client instrumentation #1226

Open gquintana opened 4 years ago

gquintana commented 4 years ago

Feature: Instrument Elasticsearch REST Client to push traceId/spanId in Elasticsearch requests.

Rational Il would allow to trace Elasticsearch client requests like MySQL on MongoDB requests. Elasticsearch requests are plain HTTP REST requests.

Prior Art

Elasticsearch REST Client 7.x is based on Apache Async HTTP Client 4.1, it looks very close to HTTP Async Client instrumentation

However Elasticsearch HttpClientConfigCallback allows to tune the HttpAsyncClientBuilder but not create a new one.

builder.setHttpClientConfigCallback(new HttpClientConfigCallback() {
        @Override
        public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
            // Tune HTTP client
            return httpClientBuilder;  
        }
    });

https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-low-usage-initialization.html

codefromthecrypt commented 4 years ago

thanks. we'd need some folks interested in order to justify the maintenance (rule of 3 meaning at least 3 not co-employed that would use this).

Some questions to guide the impl (or a temporary one!)

Is there any metadata at this level of abstraction about the query or pure HTTP. I think when I last looked it had no knowledge of elasticsearch concepts except perhaps discovery of nodes.

gquintana commented 4 years ago

Regarding the last question, AFAICT the purpose of Elasticsearch low level client is to handle connection pooling, authentication, load balancing, fail over and node discovery. See https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-low.html

All Elasticsearch querying concepts are in the high level client which wraps the low level one. This addon library allows to create request bodies and parse response bodies, but brings all Elasticsearch and Lucene libs as dependencies.

codefromthecrypt commented 4 years ago

gotcha.. so basically we're looking at normal http client instrumentation plus perhaps a tag/service name component depending on the service discovery feature

jcchavezs commented 4 years ago

I wrote one instrumentation for the go-elasticsearch library (see https://github.com/jcchavezs/zipkin-instrumentation-go-elasticsearch). It is mainly an HTTP client instrumentation with some tuning, e.g. span naming based on the operation and some optional tags:

In our use case the main use case was to correlate the span duration with the total shards being queried.

I'd say instead of allowing tagging the body of the query, it would be better to accept a body parser where it receives a set of bytes (at least in go, allowing the user to access the body directly from the request could end up in a memory leak if it is not closed correctly).

gquintana commented 4 years ago

In my case analyzing request/response is nice to have, but clearly not required. We already have metrics with search total hits, aggregation bucket count, indexing bulk size, total shards... Handling all ES request/response types looks tedious.

Having just HTTP components (verb, uri, status, duration) should be enough in my case.

codefromthecrypt commented 4 years ago

@gquintana any chance you can raise an issue with elastic for BYOC (bring your own client). The reason we override the builder is that there are some scoping gaps if you don't. Even if they refuse at least we asked?

codefromthecrypt commented 4 years ago

ex it isn't obvious here why they cannot allow the client to be overridden as an alternative to piecemeal: https://github.com/elastic/elasticsearch/blob/master/client/rest/src/main/java/org/elasticsearch/client/RestClientBuilder.java#L199-L223

codefromthecrypt commented 4 years ago

meanwhile I think that method could be overridden anyway with aspectj should you have it in the classpath (not sure limitations of overriding private methods)

gquintana commented 4 years ago

I asked to Elastic customer support, but as they have a concurrent solution, I don't expect too much.

In the mean time, do you think it's possible to wrap and decorate the HttpAsyncClientBuilder with custom code?

codefromthecrypt commented 4 years ago

In the mean time, do you think it's possible to wrap and decorate the HttpAsyncClientBuilder with custom code?

The could be problems with that as they have final methods.. ex you can't do the normal override then dispatch on methods like addInterceptorFirst...

codefromthecrypt commented 4 years ago

maybe you can force the issue a bit as it is anti-competitive to say no just because they have elastic APM? Maybe there's a github repo or a public issue where they can more publicly say yes or no?

codefromthecrypt commented 4 years ago

There are two options meanwhile:

I'd really like an elastic engineer, perhaps @felixbarny to comment on why it is important to prevent bring-your-own unless using an agent. I'd guess that Elastic can be happy that people are using their clients even if they are using zipkin

felixbarny commented 4 years ago

I'd really like an elastic engineer, perhaps @felixbarny to comment on why it is important to prevent bring-your-own unless using an agent.

I don't know the technical reason behind it, I would assume it's mainly a historical reason that predates Elastic APM. The ability to provide a custom HTTP client sounds like a good idea to me. I'd go and raise an issue at the Elasticsearch repo, that's where the client currently lives.

Ping @Mpdreamz, the tech lead of the client team in the issue who, I'm sure, has more context. We currently don't have a dedicated person for the ES Java client atm, but we have an open position. If you can agree on an approach we're more than happy to accept pull requests.

I'd guess that Elastic can be happy that people are using their clients even if they are using zipkin

For sure!

codefromthecrypt commented 4 years ago

perfect, Felix!

@gquintana can I leave it to you regarding follow up on the ES pull request. It looks straight forward, but if you need help, ping me.

On Sat, Jun 13, 2020, 4:37 PM Felix Barnsteiner notifications@github.com wrote:

I'd really like an elastic engineer, perhaps @felixbarny https://github.com/felixbarny to comment on why it is important to prevent bring-your-own unless using an agent.

I don't know the technical reason behind it, I would assume it's mainly a historical reason that predates Elastic APM. The ability to provide a custom HTTP client sounds like a good idea to me. I'd go and raise an issue at the Elasticsearch repo, that's where the client currently lives.

Ping @Mpdreamz https://github.com/Mpdreamz, the tech lead of the client team in the issue who, I'm sure, has more context. We currently don't have a dedicated person for the ES Java client atm, but we have an open position. If you can agree on an approach we're more than happy to accept pull requests.

I'd guess that Elastic can be happy that people are using their clients even if they are using zipkin

For sure!

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/openzipkin/brave/issues/1226#issuecomment-643592010, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAPVV3SRDVKHWC6J4SUKETRWM3GBANCNFSM4NYLDQEQ .

gquintana commented 4 years ago

Awesome. Thank you all