open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0
1.9k stars 826 forks source link

Exception when using Spring Cloud Stream #7220

Open IcebergXTY opened 1 year ago

IcebergXTY commented 1 year ago

Describe the bug When an application uses spring cloud stream and spring aspect, there will be an error when collect metrics.

Steps to reproduce

  1. download the demo code
  2. change the rabbitmq address/username/password in application.yaml
  3. start the application with agent -javaagent:D:\DevelopSoftware\OpenTelemetry-agent\opentelemetry-javaagent.jar -Dotel.traces.exporter=logging -Dotel.metrics.exporter=logging -Dotel.logs.exporter=logging
  4. after application started, wait for about two minutes, there will throw an exception look like this
    2022-11-17 16:47:45.519  INFO 35332 --- [(4)-10.60.52.17] o.s.web.servlet.DispatcherServlet        : Completed initialization in 1 ms
    [otel.javaagent 2022-11-17 16:49:04:411 +0800] [PeriodicMetricReader-1] WARN io.opentelemetry.sdk.metrics.internal.state.CallbackRegistration - An exception occurred invoking callback for Instrument spring.integration.handlers.
    org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'luna-taskcore-exchange.taskcore.errors.handler': Initialization of bean failed; nested exception is java.lang.IllegalArgumentException: warning no match for this type name: com.example.demo.controller.CommonController [Xlint:invalidAbsoluteTypeName]
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:628)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542)
    at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.getBeansOfType(DefaultListableBeanFactory.java:673)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.getBeansOfType(DefaultListableBeanFactory.java:661)
    at org.springframework.context.support.AbstractApplicationContext.getBeansOfType(AbstractApplicationContext.java:1300)
    at org.springframework.integration.config.IntegrationManagementConfigurer.lambda$registerComponentGauges$1(IntegrationManagementConfigurer.java:154)
    at io.opentelemetry.javaagent.shaded.instrumentation.micrometer.v1_5.DoubleMeasurementRecorder.accept(DoubleMeasurementRecorder.java:35)
    at io.opentelemetry.javaagent.shaded.instrumentation.micrometer.v1_5.DoubleMeasurementRecorder.accept(DoubleMeasurementRecorder.java:15)
    at io.opentelemetry.sdk.metrics.AbstractInstrumentBuilder.lambda$registerDoubleAsynchronousInstrument$0(AbstractInstrumentBuilder.java:87)
    at io.opentelemetry.sdk.metrics.internal.state.CallbackRegistration.invokeCallback(CallbackRegistration.java:97)
    at io.opentelemetry.sdk.metrics.internal.state.MeterSharedState.collectAll(MeterSharedState.java:96)
    at io.opentelemetry.sdk.metrics.SdkMeter.collectAll(SdkMeter.java:67)
    at io.opentelemetry.sdk.metrics.SdkMeterProvider$LeasedMetricProducer.collectAllMetrics(SdkMeterProvider.java:180)
    at io.opentelemetry.sdk.metrics.export.PeriodicMetricReader$Scheduled.doRun(PeriodicMetricReader.java:156)
    at io.opentelemetry.sdk.metrics.export.PeriodicMetricReader$Scheduled.run(PeriodicMetricReader.java:148)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Caused by: java.lang.IllegalArgumentException: warning no match for this type name: com.example.demo.controller.CommonController [Xlint:invalidAbsoluteTypeName]
    at org.aspectj.weaver.tools.PointcutParser.parsePointcutExpression(PointcutParser.java:319)
    at org.springframework.aop.aspectj.AspectJExpressionPointcut.buildPointcutExpression(AspectJExpressionPointcut.java:227)
    at org.springframework.aop.aspectj.AspectJExpressionPointcut.obtainPointcutExpression(AspectJExpressionPointcut.java:198)
    at org.springframework.aop.aspectj.AspectJExpressionPointcut.getClassFilter(AspectJExpressionPointcut.java:177)
    at org.springframework.aop.support.AopUtils.canApply(AopUtils.java:226)
    at org.springframework.aop.support.AopUtils.canApply(AopUtils.java:289)
    at org.springframework.aop.support.AopUtils.findAdvisorsThatCanApply(AopUtils.java:321)
    at org.springframework.aop.framework.autoproxy.AbstractAdvisorAutoProxyCreator.findAdvisorsThatCanApply(AbstractAdvisorAutoProxyCreator.java:128)
    at org.springframework.aop.framework.autoproxy.AbstractAdvisorAutoProxyCreator.findEligibleAdvisors(AbstractAdvisorAutoProxyCreator.java:97)
    at org.springframework.aop.framework.autoproxy.AbstractAdvisorAutoProxyCreator.getAdvicesAndAdvisorsForBean(AbstractAdvisorAutoProxyCreator.java:78)
    at org.springframework.aop.framework.autoproxy.AbstractAutoProxyCreator.wrapIfNecessary(AbstractAutoProxyCreator.java:341)
    at org.springframework.aop.framework.autoproxy.AbstractAutoProxyCreator.postProcessAfterInitialization(AbstractAutoProxyCreator.java:293)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsAfterInitialization(AbstractAutowireCapableBeanFactory.java:455)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1808)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:620)
    ... 25 more

What did you expect to see? No exception throw.

What version are you using? Version 1.19.2

Environment Compiler: Eclipse Temurin 1.8.0_302 OS: windows 10 21H1 64bit

Additional context Anyway I think this problem is not caused by anyone, including spring and opentelemetry. It's just a coincidence. I will try to analyze the it here.


The error is reported at org.springframework.integration.config.IntegrationManagementConfigurer#registerComponentGauges, As shown here image This part of the logic is lazy loading, only when called will get and initialization of type org.springframework.messaging.MessageHandler. Here will get three beanName, The error is raised by luna-taskcore-exchange.taskcore.errors.handler. image When the luna-taskcore-exchange.taskcore.errors.handler bean is initializing, it's going to be processed by a BeanPostProcessor named org.springframework.aop.aspectj.annotation.AnnotationAwareAspectJAutoProxyCreator. The AnnotationAwareAspectJAutoProxyCreator will try all the aspects in org.springframework.aop.support.AopUtils#findAdvisorsThatCanApply method if it is suitable for the handler bean.

On the other hand, when spring try to load my com.example.demo.aspect.ExceptionAspect, first it will process the aspectj expression by org.springframework.aop.aspectj.AspectJExpressionPointcut#obtainPointcutExpression.Here, it will use the current thread's classloader to process the aspectj expression and this is AgentClassloader now.So when load my expression execution(* com.example.demo.controller.CommonController.*(..)), spring will raise an error as AgentClassloader don't know what CommonController is. image


The temporary solution is to load these bean in spring EventListener before the agent effect, but it's hard to know all the bean need to be early loaded. And as https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/7037 say, the agent actualy can't load the application class. So there is any other solution to solve this problem?

laurit commented 1 year ago

You could work around this by calling pointcut.setBeanFactory(...); in https://github.com/IcebergXTY/opentelemetry-cloud-stream-demo/blob/b52d74f2e317be1c5a97947a1bd187345c2fe68c/src/main/java/com/example/demo/aspect/DynamicPointcutAdvisor.java#L25 You can get bean factory from applicationContext.getAutowireCapableBeanFactory() I guess we need to remember context class loader when DoubleMeasurementRecorder is created and restore it in accept. @mateuszrzeszutek wdyt? An alternative way to fix it would we to clear context class loader from PeriodicMetricReader thread. When context class loader is null spring would use the class loader that loaded spring classes which probably would work. While this fixing the issue would be more of a coincidence not exposing agent class loader to user code seems reasonable. @jack-berg wdyt?

IcebergXTY commented 1 year ago

You can get bean factory from applicationContext.getAutowireCapableBeanFactory()

This works well, thanks

mateuszrzeszutek commented 1 year ago

I guess we need to remember context class loader when DoubleMeasurementRecorder is created and restore it in accept. @mateuszrzeszutek wdyt?

I suppose we could fix it that way, but...

An alternative way to fix it would we to clear context class loader from PeriodicMetricReader thread. When context class loader is null spring would use the class loader that loaded spring classes which probably would work. While this fixing the issue would be more of a coincidence not exposing agent class loader to user code seems reasonable. @jack-berg wdyt?

I think this issue might manifest even if OTel metrics API is used, even without spring - just loading an application class from within the callback function will probably blow up. The safest route we might take is remembering the context classloader used when creating each callback. We could probably do that just in the agent (micrometer shim & otel-api bridge), but I'm not sure whether this would fix the whole problem - I think it'd still be possible for the javaagent instrumentations (which don't use the bridges) to create async metric instruments that accidentally load classes.