Open gabrielgiussi opened 2 years ago
We have some runtime logic that conditionally used io.grpc.ManagedChannel
if the user calls OtlpGrpcMetricExporterBuilder#setChannel(ManagedChannel)
. In java this works fine without including grpc-stub
on the classpath as demonstrated in the default test suite of :exporters:otlp:all
. This behavior relies on pretty careful references to io.grpc.ManagedChannel
such that the class is only loaded when the user references it. That's the theory anyway.
I was able to recreate this error locally. Groovy class loading appears to work differently - the groovy runtime is more aggressive about loading classes than java.
Open to ideas on how we can fix this. I can think of some ugly ideas (like providing your own empty implementation of io.grpc.ManagedChannel
) but don't know groovy internals well and don't know the "right" way to fix this.
@jack-berg good to know that the groovy classloader breaks things...but this question is about clojure. Perhaps it also does some classloading shenanigans?
🤦 ah..
Perhaps it also does some classloading shenanigans
Yes, when it parses the expression (.build (OtlpGrpcMetricExporter/builder))
it asks for the methods of the class to find if there is any called build
with no arguments so it can call that method. This part of the stacktrace is doing that
[java.lang.Class getMethods "Class.java" 1904]
[clojure.lang.Reflector getMethods "Reflector.java" 498]
[clojure.lang.Compiler$HostExpr$Parser parse "Compiler.java" 979]
Once we call getMethods then it is already java trying to load the class io.grpc.ManagedChannel
since one of the method setChannel
receives that.
I see the method is deprecated, do you have plans to remove it entirely in the near future? I think we could do some tricks like including our own implementation of the class meanwhile but IMHO we shouldn't be doing any special class loading to avoid this error.
It's worth noting, we rely on the specifics of java class loading in several places in the autoconfiguration module as well, so this won't be the only place where closure is impacted.
I'm not sure we can do anything about this. Maybe if we knew more about closure's mechanics for loadings classes we could do something like change OtlpGrpcMetricExporter.builder(ManagedChannel)
to accept an Object
instead of ManagedChannel
. But the internals will still eventually need to cast to ManagedChannel
, which might still trigger closure to try to load the class. But even if that worked for closure, it would have to wait until a 2.x.x version since it's not an allowable change under our backwards compatibility guarantees.
Is there anything we could do from a classloader perspective to provide a fallback implementation of io.grpc.ManagedChannel
is none is available?
we could do something like change OtlpGrpcMetricExporter.builder(ManagedChannel) to accept an Object instead of ManagedChannel. But the internals will still eventually need to cast to ManagedChannel, which might still trigger closure to try to load the class
I think that will work, because I'm never explicitly calling the setChannel
so clojure won't try to load the class, and correct me if I'm wrong but I understand that method is not called internally or indirectly, otherwise having that dependency in the classpath shouldn't be optional. In other words can we be sure that if I'm not calling the setChannel method explicitly then that method won't be called at all? If that's the case then making setChannel
accept an Object
instead of a ManagedChannel
will prevent this exception.
I'm never explicitly calling the setChannel so clojure won't try to load the class
Is there a reference available that describes the mechanics around when / how clojure decides to load a class? We already don't call setChannel
internally ever. The code path you're using only has unused references to ManagedChannel
(i.e. imports in the class used in methods that are not called). Not sure that switching to an Object
would solve the problem.
Even if it did work, the next sentence from the quote is important:
even if that worked for closure, it would have to wait until a 2.x.x version since it's not an allowable change under our backwards compatibility guarantees.
I think Clojure is just calling .getMethods()
on the class object for the OtlpGrpcMetricExporter
class, and Java reflection is loading the classes for the types referenced in method signatures. If this is correct, any other tool that uses reflection to introspect this class would trigger the problem.
If no member of the class references the lazy-loadeable class (which I think would be the goal of changing the setChannel
signature to take an Object), it should be fine, even some of the implementation depends on that class. At least, reflection users wouldn't be affected, I think.
@jack-berg
even if that worked for closure, it would have to wait until a 2.x.x version since it's not an allowable change under our backwards compatibility guarantees
In that case won't be better just get rid of the method since is marked as deprecated?
I think Clojure is just calling .getMethods() on the class object for the OtlpGrpcMetricExporter class, and Java reflection is loading the classes for the types referenced in method signatures. If this is correct, any other tool that uses reflection to introspect this class would trigger the problem.
If this is indeed the case, I don't think we can solve this issue with changes in opentelemetry-java
. We rely on a combination of compileOnly
dependencies and java's class loading semantics to drive behavior which is conditional on the classpath. Supporting reflective access opens the door to interacting with the library in all sorts of unpredictable ways that we intentionally don't expose in the public APIs.
In that case won't be better just get rid of the method since is marked as deprecated?
Well there's an outstanding issue requesting that we un-deprecate the setChannel
method. If the arguments in that issue are resolved by the time a 2.x.x we could remove it.
We rely on a combination of compileOnly dependencies and java's class loading semantics to drive behavior which is conditional on the classpath.
Since we are using opentelemetry-java in an experimental project we decided to add the grpc-api (*) dependency so the ManagedChannel class is in the classpath and the exception is not raised. But from this comment I'm not sure if doing so is doing more than just preventing the exception but also changing some behavior in a way we as users can't anticipate. In other words
(*) BTW where can I find which is the right version of this lib for a specific version of opentelemetry-sdk
, e.g. 1.19.0
Is the otel library doing something different just because the ManagedChannel is in the classpath?
No. It was in the past, but as of now the upstream grpc implementation is only used if setChannel
is called.
Is this strategy applied with other dependencies as well? I'm asking this to understand if including the dependency in our side as a solution will scale or we may have to add others as well (so far adding only grpc-api was enough).
Here is the set of all the examples where logic is dependent on the classpath:
DoubleAdder
/ LongAdder
on the classpath, which perform better than atomics when there is high contention, but which aren't available in certain versions of android. If they aren't available, we use a lower performance implementation based on AtomicLong
. ArrayBlockingQueue
if jdk.unsupported
is not included in the JRE distribution.AbstractStub
is on the classpath. Code here. GlobalOpenTelemetry.get()
will trigger autoconfiguration if it is detected on the classpath. Code here. Not that besides the setChanel(ManagedChannel)
these are all internal implementation details not exposed in public APIs.
(*) BTW where can I find which is the right version of this lib for a specific version of opentelemetry-sdk, e.g. 1.19.0
Version doesn't matter since the code isn't actually being used, but we track all our dependency versions here. Current version of grpc is 1.50.2
.
Thanks for the detailed response 👍
I'm also getting "java.lang.ClassNotFoundException: io.grpc.ManagedChannel", I am trying to create a "jenkins" shared library but even that I am using "OtlpGrpcMetricExporter.builder().setEndpoint" still there is an exception that prevent me to run can we have a fix or a way to run?
Describe the bug A ClassNotFoundException is thrown trying to build the
OtlpGrpcMetricExporter
in clojure.I don't have the grcp lib in the classpath but this is working in a java project. I saw this message in the changelog for the version 1.17.0
So I understand it is ok to not have the grpc dependency and I do have
okhttp:4.10.0
in the classpath as a transitive dependency.Steps to reproduce Loading a namespace into the repl with this snippet already triggers the exception
What version and what artifacts are you using? Artifacts: opentelemetry-api, opentelemetry-sdk, opentelemetry-exporter-otlp Version: 1.19.0
Environment Compiler: AdoptOpenJDK 11.0.11
Additional context If I include the
grpc-stub
dependency then it is working fine but I don't know which version should I use or if this is the expected way to solve it.