Open IcebergXTY opened 1 year ago
You could work around this by calling pointcut.setBeanFactory(...) in https://github.com/IcebergXTY/opentelemetry-cloud-stream-demo/blob/b52d74f2e317be1c5a97947a1bd187345c2fe68c/src/main/java/com/example/demo/aspect/DynamicPointcutAdvisor.java#L25. You can get the bean factory from applicationContext.getAutowireCapableBeanFactory().
I guess we need to remember context class loader when DoubleMeasurementRecorder
is created and restore it in accept. @mateuszrzeszutek wdyt?
An alternative way to fix it would be to clear the context class loader from the PeriodicMetricReader thread. When the context class loader is null, Spring would use the class loader that loaded the Spring classes, which would probably work. While fixing the issue this way would be more of a coincidence, not exposing the agent class loader to user code seems reasonable. @jack-berg wdyt?
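A minimal sketch of that alternative, using only the JDK. The thread factory and the `resolveLoader` helper are hypothetical names for illustration; this is not the actual PeriodicMetricReader code, and the fallback lookup only mimics the "use the context class loader, else the loader of the framework's own classes" pattern described above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

final class NullContextLoaderDemo {

    // Hypothetical factory: worker threads start with no context class loader,
    // so nothing inherited from the creating thread leaks into callbacks.
    static ThreadFactory withoutContextClassLoader(String name) {
        return runnable -> {
            Thread thread = new Thread(runnable, name);
            thread.setDaemon(true);
            thread.setContextClassLoader(null);
            return thread;
        };
    }

    // Fallback lookup: prefer the context class loader, otherwise use the
    // loader that defined the given class (assumed to be a framework class).
    static ClassLoader resolveLoader(Class<?> fallbackOwner) {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        return contextLoader != null ? contextLoader : fallbackOwner.getClassLoader();
    }

    // Runs the lookup on a worker thread created by the factory above.
    static ClassLoader loaderSeenOnWorker() {
        ExecutorService executor =
                Executors.newSingleThreadExecutor(withoutContextClassLoader("metric-reader-sketch"));
        try {
            Callable<ClassLoader> task = () -> resolveLoader(NullContextLoaderDemo.class);
            return executor.submit(task).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(loaderSeenOnWorker() == NullContextLoaderDemo.class.getClassLoader());
    }
}
```

With the context class loader cleared, the lookup on the worker thread falls back to the loader of the owning class, which is the behavior this alternative relies on.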
You can get bean factory from applicationContext.getAutowireCapableBeanFactory()
This works well, thanks
I guess we need to remember context class loader when DoubleMeasurementRecorder is created and restore it in accept. @mateuszrzeszutek wdyt?
I suppose we could fix it that way, but...
An alternative way to fix it would be to clear the context class loader from the PeriodicMetricReader thread. When the context class loader is null, Spring would use the class loader that loaded the Spring classes, which would probably work. While fixing the issue this way would be more of a coincidence, not exposing the agent class loader to user code seems reasonable. @jack-berg wdyt?
I think this issue might manifest even if OTel metrics API is used, even without spring - just loading an application class from within the callback function will probably blow up. The safest route we might take is remembering the context classloader used when creating each callback. We could probably do that just in the agent (micrometer shim & otel-api bridge), but I'm not sure whether this would fix the whole problem - I think it'd still be possible for the javaagent instrumentations (which don't use the bridges) to create async metric instruments that accidentally load classes.
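The "remember the context class loader when creating each callback" idea could look roughly like the sketch below. It uses a plain `java.util.function.Consumer` as a stand-in for a metrics callback; `ContextLoaderCapture` and `capturing` are hypothetical names, not part of the agent, the micrometer shim, or the otel-api bridge.

```java
import java.util.function.Consumer;

final class ContextLoaderCapture {

    // Wraps a callback so that it always runs with the context class loader
    // that was current when the wrapper was created.
    static <T> Consumer<T> capturing(Consumer<T> delegate) {
        ClassLoader captured = Thread.currentThread().getContextClassLoader();
        return value -> {
            Thread current = Thread.currentThread();
            ClassLoader previous = current.getContextClassLoader();
            current.setContextClassLoader(captured);
            try {
                delegate.accept(value);
            } finally {
                // Always put back whatever loader the calling thread had.
                current.setContextClassLoader(previous);
            }
        };
    }

    public static void main(String[] args) {
        Consumer<String> callback = capturing(
                v -> System.out.println(Thread.currentThread().getContextClassLoader()));
        callback.accept("measurement");
    }
}
```

The wrapper is created on the application thread, so a later invocation from a metric reader thread sees the application's loader during the callback, and the reader thread's own loader is restored afterwards.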
Describe the bug When an application uses Spring Cloud Stream together with Spring aspects, an error occurs when metrics are collected.
Steps to reproduce
What did you expect to see? No exception thrown.
What version are you using? Version 1.19.2
Environment Compiler: Eclipse Temurin 1.8.0_302 OS: windows 10 21H1 64bit
Additional context I don't think this problem is anyone's fault, neither Spring's nor OpenTelemetry's. It's just a coincidence. I will try to analyze it here.
The error is reported at org.springframework.integration.config.IntegrationManagementConfigurer#registerComponentGauges. This part of the logic is lazy: beans of type org.springframework.messaging.MessageHandler are only looked up and initialized when it is called.
Three bean names are returned here, and the error is raised by luna-taskcore-exchange.taskcore.errors.handler. While the luna-taskcore-exchange.taskcore.errors.handler bean is initializing, it is processed by a BeanPostProcessor named org.springframework.aop.aspectj.annotation.AnnotationAwareAspectJAutoProxyCreator. The AnnotationAwareAspectJAutoProxyCreator tries every aspect in the org.springframework.aop.support.AopUtils#findAdvisorsThatCanApply method to see whether it applies to the handler bean.
On the other hand, when Spring tries to load my com.example.demo.aspect.ExceptionAspect, it first processes the AspectJ expression in org.springframework.aop.aspectj.AspectJExpressionPointcut#obtainPointcutExpression. Here it uses the current thread's class loader to process the AspectJ expression, and at this point that is AgentClassLoader. So when it loads my expression execution(* com.example.demo.controller.CommonController.*(..)), Spring raises an error because AgentClassLoader doesn't know what CommonController is.
The temporary solution is to load these beans in a Spring EventListener before the agent takes effect, but it's hard to know all the beans that need to be loaded early. And as https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/7037 says, the agent actually can't load application classes. So is there any other solution to this problem?
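The failure mode described above can be illustrated without Spring or the agent. In this sketch, a class loader with no parent stands in for AgentClassLoader (an assumption for illustration only), and `ContextLoaderLookup` is a hypothetical helper that resolves a class name through the thread's context class loader, similar in spirit to what the pointcut parsing does.

```java
final class ContextLoaderLookup {

    // Resolves a class name through the current thread's context class loader.
    static boolean visibleFromContextLoader(String className) {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        try {
            Class.forName(className, false, contextLoader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Temporarily installs the given loader as the context class loader,
    // performs the lookup, and restores the previous loader.
    static boolean visibleWith(ClassLoader loader, String className) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return visibleFromContextLoader(className);
        } finally {
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the agent's loader: it only delegates to the bootstrap loader.
        ClassLoader isolated = new ClassLoader(null) {};
        String applicationClass = ContextLoaderLookup.class.getName();

        System.out.println(visibleWith(isolated, applicationClass));
        System.out.println(visibleWith(ContextLoaderLookup.class.getClassLoader(), applicationClass));
    }
}
```

The same class name resolves or fails depending only on which loader happens to be the thread's context class loader, which is why a callback running on an agent-owned thread cannot see application classes such as CommonController.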