elastic / apm-agent-java

https://www.elastic.co/guide/en/apm/agent/java/current/index.html
Apache License 2.0
566 stars 321 forks source link

Neo4j plugin #942

Open fgabolde opened 4 years ago

fgabolde commented 4 years ago

We use Neo4j for some use cases at work and adding basic support for tracing Cypher queries would help us debug performance issues in the services that use them.

A Neo4j plugin for tracing queries made through org.neo4j.driver:neo4j-java-driver seems like it wouldn't be out of scope for this project. As far as I can tell, most options for communicating with a Neo4j server use this driver behind the scenes (our spring-data repository certainly does, out of the box anyway).

I have done some preliminary work on this and I got a plugin to work with our APM (7.4). I can see the traces correctly reported with some metadata attached.

I'm willing to submit a PR (though the code still needs some work) and I have also gotten a verbal agreement from my CTO regarding the CLA. Is this something that would be of interest to you?

lreuven commented 4 years ago

Hi, Thanks for the feedback and supporting Neo4j plugin will be great and we will be happy to review it. some steps which will enable us to review it faster :

again, thanks for your contribution and let us know how we can assist.

fgabolde commented 4 years ago

It turns out there's a new series of drivers with a different API that my instrumentation probably won't work with.

spring-boot-starter-data-neo4j in Spring Boot 2.1.x depends on org.neo4j.driver:neo4j-java-driver:1.6.3 but the 2.2.x releases use org.neo4j.driver:neo4j-java-driver:4.0.0-beta02.

As far as I can tell the API is fairly different. The class I'm instrumenting for 1.x doesn't exist anymore in 4.0, does that mean that the current instrumentation is safe? I don't plan on instrumenting the new driver immediately as we are not using it right now, but to be safe I'll work on it sooner or later.

How do plugin maintainers manage multiple incompatible API versions?

SylvainJuge commented 4 years ago

It depends on how different the APIs are:

If classes are completely unrelated, for example in different packages, then nothing is required, newer versions will just be ignored as they do not match instrumentation plugin.

If classes are reusing old version API, which is frequent just to ensure easier migration, that's trickier. You can add checks for the presence of a certain class and make instrumentation only applied to a certain version. Binary compatibility can sometimes be an issue (for example an abstract class being replaced by an interface), but that's quite rare.

I'm not really familliar with neo4j driver, do you have any hints at what are the major API changes between those versions ?

Also, if I understand correctly, what you worry about is having an unexpected instrumentation with the newer driver. If that's the case, there is quick and cheap test that you can do: just start a simple application that uses this driver and check agent debug logs for any instrumented class.

fgabolde commented 4 years ago

If classes are completely unrelated, for example in different packages, then nothing is required, newer versions will just be ignored as they do not match instrumentation plugin.

Yes, that seems to be the case.

I'm not really familliar with neo4j driver, do you have any hints at what are the major API changes between those versions ?

Here's the changelog; mostly of interest for future users reading this issue, since it is not very specific as to the API changes. https://github.com/neo4j/neo4j-java-driver/wiki/4.0-changelog

The org.neo4j.driver.v1.StatementRunner interface that I was instrumenting doesn't exist anymore in the 4.0 branch (even its package has been removed, actually). I haven't had time to look much further.

1.x is bound to have active users for a good while so I guess this PR will not be completely useless. I may work on supporting the new driver if we upgrade some applications once it's officially released.

Also, if I understand correctly, what you worry about is having an unexpected instrumentation with the newer driver. If that's the case, there is quick and cheap test that you can do: just start a simple application that uses this driver and check agent debug logs for any instrumented class.

I'll do that, thanks for the tip.

fgabolde commented 4 years ago

@lreuven I'm not sure what you mean here:

* write integration tests with APM agent (akin to e.g. the application server ones), failing on some basic things not being captured

I took a look at these tests, they seem to handle more complex cases like runtime environments (Tomcat, Jetty etc.). I'm only instrumenting a new datastore, so I wrote some tests that stand up a database instance using testcontainers, ran some typical queries and checked that spans were being created. Do you think additional tests are required?

felixbarny commented 4 years ago

That is sufficient, no need to extend the integration tests unless there's no way to test via unit tests by extending AbstractInstrumentationTest.