Open Xitric opened 3 years ago
@Xitric SpecialAgent is not actively supported anymore. I would suggest looking at OpenTelemetry, specifically https://github.com/open-telemetry/opentelemetry-java-instrumentation
Unfortunately, that is not an option since they dropped support for Java 7, and we specifically need to monitor a legacy platform. Do you have any other recommendations for alternatives to SpecialAgent? We would like to avoid vendor lock-in for the backend if at all possible.
No, I don't have any vendor-free recommendation. You can probably create manual OpenTracing instrumentation for your application using the existing OpenTracing libraries (https://github.com/opentracing-contrib).
Hi @Xitric.
Would it help to instrument the OSGi classloader itself, so that it tries to load classes from bootstrap?
If so, maybe the link below can serve as inspiration, and if the code does not exist in this repo you can just copy it to your fork?
Thanks for the tip @cptkng23.
I have actually already looked into how the OpenTelemetry javaagent and the Elastic javaagent handle OSGi classloaders. I have tried to inject similar code into the OSGi DefaultClassLoader#loadClass(), but alas even this call throws a ClassNotFoundException at runtime for all SpecialAgent classes related to rules. I have similarly tried to use other classloaders, such as delegating to the parent classloader or using the classloaders of other classes in SpecialAgent, but with similar results.
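For readers following along, the "try bootstrap from inside the classloader" idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the code used by SpecialAgent, OpenTelemetry, or Elastic: BootstrapFallbackClassLoader and its prefix list are hypothetical names, and the loader is deliberately strict so that the fallback is visible. It sticks to Java 7 syntax.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical illustration: a loader that refuses everything except
 * java.* and an allow-list of package prefixes, which it resolves
 * through the bootstrap loader. Not part of any real agent.
 */
final class BootstrapFallbackClassLoader extends ClassLoader {
    private final List<String> bootPrefixes;

    BootstrapFallbackClassLoader(String... bootPrefixes) {
        super(null); // no parent loader
        this.bootPrefixes = Arrays.asList(bootPrefixes);
    }

    private boolean isBootDelegated(String name) {
        for (String prefix : bootPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (name.startsWith("java.") || isBootDelegated(name)) {
            // Passing null as the loader asks Class.forName to use the bootstrap loader.
            return Class.forName(name, false, null);
        }
        // Everything else is treated as invisible, like a strict bundle loader.
        throw new ClassNotFoundException(name + " is not visible to this loader");
    }
}
```

With this sketch, new BootstrapFallbackClassLoader("javax.security.") can load javax.security.auth.Subject, while a loader created without that prefix throws ClassNotFoundException for the same name. Agent classes would only benefit from such a fallback if they are actually visible to the bootstrap loader.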
While I still have much reverse engineering and debugging to do, I am starting to form a theory on the issue. I believe that the base SpecialAgent classes that are available directly in the generated .jar file are the only classes that I can access on the classloaders. The Rule classes, which are somehow made available to the classloaders at startup, are not available to my WebSphere applications. I have tried to provide these Rule classes via other means (by placing them in a separate, fat .jar, which is loaded by WebSphere at startup) just to see if it would work, and it does. However, I have only been able to load them by using Thread.currentThread().getContextClassLoader(), and this is only a temporary solution.
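The temporary workaround mentioned above, falling back to the thread context classloader when the usual loader cannot see a class, could look roughly like this. It is a minimal sketch under assumptions: ContextLoading and loadWithContextFallback are hypothetical names, not SpecialAgent API.

```java
/**
 * Hypothetical helper: try a primary loader first, then fall back to the
 * current thread's context classloader.
 */
final class ContextLoading {
    private ContextLoading() {}

    static Class<?> loadWithContextFallback(String name, ClassLoader primary)
            throws ClassNotFoundException {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            return Class.forName(name, false, primary);
        } catch (ClassNotFoundException primaryFailure) {
            ClassLoader context = Thread.currentThread().getContextClassLoader();
            if (context == null || context == primary) {
                // Nothing else to try, so report the original failure.
                throw primaryFailure;
            }
            return Class.forName(name, false, context);
        }
    }
}
```

The fragility is visible in the sketch: the result depends on whichever context classloader the calling thread happens to carry, which is why it only works as a stopgap.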
During startup of SpecialAgent, I noticed that the agent "injects" its Rule classes into instances of the OSGi DefaultClassLoader. There are hundreds of entries such as this:
>>>>>>>> inject(org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader@-31b14e64)
I worry that, because of the way IBM's classloaders work in WebSphere, these classloaders may not be readily available to my applications. But this is the part that I am still investigating. I think tomorrow I might try to inject some tracing logs into all classloaders to understand how they communicate at runtime, and perhaps pinpoint why the Rule classes are not visible in my WebSphere applications.
Any ideas are greatly appreciated!
Finally tracked down the root cause of my problems - it turns out I was looking in the wrong place. I was so confident the issue had to be with SpecialAgent, I did not realize that it was a bug in the JVM itself! However, it took a lot of digging around before this bug would eventually reveal itself.
Source: https://www.ibm.com/support/pages/apar/IV76963
Unfortunately, this bug was never fixed in the JVM that we are required to use, so I have implemented a workaround in my fork. Specifically, I had to extract the isThreadInstrumentable inner class to its own class definition.
We have now been trying to get SpecialAgent working in a WebSphere deployment for the last couple of days. We have identified a number of incompatibility issues, for which I will open a pull request once this is all working. However, we seem to have gotten stuck on some classloading issues, particularly in relation to the OSGi classloader used by the application server.
Here is some information on our environment:
Firstly, we noticed that ByteBuddyManager.scanRules() was failing when it came across a plugin for an unsupported version of Java. Since we use Java 7, and some plugins are created for Java 8, this prevented the successful loading of most plugins. Currently, it stops scanning as soon as a plugin fails loading, but we believe that it should rather skip that plugin and try loading the rest. And here is the change we made (original):
We also noticed that our environment uses a com.ibm.ws.bootstrap.ExtClassLoader which can return null values on the call to classLoader.getURLs() in ClassLoaderMap (link). We made this small change:
With this in place, we can successfully achieve instrumentation of our servlets, but they fail at runtime due to some classes not being available to the OSGi BundleLoader:
We understand that this is likely because the OSGi classloader does not look for class definitions in the bootloader, where the SpecialAgent classes are injected. However, we have been unable to resolve this issue thus far. We even tried to add the opentracing packages to the OSGi bootdelegation, as recommended for other agents (AppDynamics):
However, this seems to have no effect, and the same exact error as above is logged. We are happy to help in resolving this issue, but at this point I think we require help in understanding the root cause of our issues.
Sorry for the long writeup, but I wanted to provide you with as much information about our process as I could. I look forward to hearing from you @malafeev @safris, or someone else!
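For anyone reading this later, the two defensive changes described in this post (skip a plugin that fails to load instead of aborting the scan, and tolerate a null result from getURLs()) can be sketched roughly as follows. This is not the actual patch to ByteBuddyManager or ClassLoaderMap: ScanHelpers, loadCompatible and urlsOrEmpty are hypothetical names, and the sketch sticks to Java 7 syntax.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helpers illustrating the two workarounds; not SpecialAgent code. */
final class ScanHelpers {
    private ScanHelpers() {}

    /** Returns the names that could be loaded, skipping any that fail. */
    static List<String> loadCompatible(List<String> classNames, ClassLoader loader) {
        List<String> loaded = new ArrayList<String>();
        for (String name : classNames) {
            try {
                Class.forName(name, false, loader);
                loaded.add(name);
            } catch (ClassNotFoundException | LinkageError e) {
                // LinkageError covers UnsupportedClassVersionError, which is what a
                // class compiled for a newer Java version raises on an older JVM.
                System.err.println("Skipping " + name + ": " + e);
            }
        }
        return loaded;
    }

    /** Treats a null result from getURLs() as "no URLs". */
    static URL[] urlsOrEmpty(URLClassLoader loader) {
        URL[] urls = loader.getURLs();
        return urls != null ? urls : new URL[0];
    }
}
```

Catching LinkageError rather than Exception is deliberate, since UnsupportedClassVersionError is an Error and would pass straight through a catch (Exception e) block.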