Closed simonis closed 6 days ago
The idea of the warump strategy is that it executes the code of the ClassFileTransformer in each expected way. Ideally in form of a transformed class, an ignored class and a class that fails the transformation.
This would normally result in all relevant classes that are in use on a particular VM to execute transformation. This would then include the VM classes used by Byte Buddy and the user.
The problem with listing the classes would be that transitive use would not be addressed? If I loaded ConcurrentHashMap, ThreadLocalRandom would still be initiated late?
I might however make an internal prerun of the circularity lock which is also built to avoid these problems.
I hopefully avoid this in the circularity lock now which hopefully loads all relevant classes during construction from now on.
Hi @raphw ,
The attempt to “warm-up” ClassCircularityLock.Default
will not solve the ClassCircularityError
risk for Byte Buddy users because ConcurrentHashMap::putIfAbsent
only loads ThreadLocalRandom
if it detects thread contention.
A quick experiment (on Java 17) shows that ThreadLocalRandom
is not loaded when the hash table is called with a single putIfAbsent
.
import java.util.concurrent.ConcurrentHashMap;
public class Main {
public static void main(String[] args) {
ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
map.putIfAbsent("foo", "bar");
}
}
We see ConcurrentHashMap
and its inner classes loaded, but not ThreadLocalRandom
:
> java -Xlog:class+load=info -cp out Main | grep ConcurrentHashMap
[0.020s][info][class,load] java.util.concurrent.ConcurrentHashMap source: shared objects file
[0.021s][info][class,load] java.util.concurrent.ConcurrentHashMap$Node source: shared objects file
[0.025s][info][class,load] java.util.concurrent.ConcurrentHashMap$Segment source: shared objects file
[0.025s][info][class,load] java.util.concurrent.ConcurrentHashMap$CounterCell source: shared objects file
[0.025s][info][class,load] java.util.concurrent.ConcurrentHashMap$ReservationNode source: shared objects file
> java -Xlog:class+load=info -cp out Main | grep ThreadLocalRandom
>
However, when we use multiple threads:
import java.util.concurrent.ConcurrentHashMap;
public class Main {
public static void main(String[] args) {
ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
for (int k = 0; k < 100; k++) {
new Thread(() -> {
for (int i = 0; i < 10000; i++) {
map.putIfAbsent(i, "bar");
}
}).start();
}
}
}
Then we do see ThreadLocalRandom
:
> java -Xlog:class+load=info -cp out Main | grep ThreadLocalRandom
[0.049s][info][class,load] java.util.concurrent.ThreadLocalRandom source: shared objects file
Note that merely looping 10000 times with just one thread didn't trigger the load of ThreadLocalRandom
.
I believe that the implementation of class circularity prevention (e.g. CircularityLock
) should be using a set of classes whose entire dependency closure can be reliably and cheaply preloaded (e.g. without the need to spin up multiple threads to trigger the load).
I explored the following approach:
The problem with this is that as of Java 17, I counted 4446 classes in the closure of java.lang.Object
, including classes like com.sun.crypto.provider.SunJCE
and java.util.regex.EmojiData
. Preloading all of them might be problematic for applications with stringent initialization time requirements; it takes circa 500-1000 ms on my laptop, single-threaded. To reduce this cost, closure computation could be done at the method level, not just the class level.
For example, if class C
has methods c1
and c2
, CircularityLock
uses C::c1
but not C::c2
, and c2
depends on class D
but not c1
, then the naïve class-based dependency scanner would put D
into the closure we need to preload, while a method-based scanner would not.
That being said, without a dependency scanner of some sort it's not possible for an instrumenting Java Agent to be truly safe from ClassCircularityError
.
Indeed, I did not dig into the code, but obviously not all relevant methods will be resolved.
It is however difficult to anticipate the dependency graph statically as the JVM can always change an implementation even in minor updates, and Byte Buddy cannot analyze all VM versions during a build. It would be to unreliable and I am looking for a dynamic solution.
I attempted a new solution now that uses code that protects the circularity lock with an additional global lock around the locking mechanism method. This might add some overhead, but class loading in itself is already an expensive operation, so I would not believe that it is notable, and it is certainly preferred over crashing the VM.
Hi @raphw,
We ran into
ClassCircularityError
s due to ec39754 which introduced the usage ofConcurrentHashMap
/ConcurrentHashMap::putIfAbsent()
in theAgentBuilder
class.ConcurrentHashMap::putIfAbsent()
can transitively trigger a call toConcurrentHashMap::fullAddCount()
which in turn triggers the resolution ofThreadLocalRandom
.If a transforming agent is registered and
ClassFileTransformer::transform()
gets called forThreadLocalRandom
before it was resolved inConcurrentHashMap
and thenClassFileTransformer::transform()
transitively callsConcurrentHashMap::fullAddCount()
through ByteBuddy, the resolution ofThreadLocalRandom
will fail with aClassCircularityError
. This can happen even if the agent doesn't really want to transformThreadLocalRandom
, but only uses ByteBuddy APIs to excludeThreadLocalRandom
or other classes from transformation.Notice that due to §5.4.3 "Resolution" of the Java Virtual Machine (JVM) Specification, this resolution failure will become a permanent failure on every future attempt to resolve the class for that specific constant pool entry:
This made it particularly hard to detect and attribute this error to the new ByteBuddy version which now uses
ConcurrentHashMap
, because:ConcurrentHashMap::fullAddCount()
doesn't get called deterministically by ByteBuddy. This only happens when theConcurrentHashMap
used by ByteBuddy is accessed concurrently due to parallel class loading which is inherently non-deterministic.ClassCircularityError
duringtransform()
is treated by default as iftransform()
returnsnull
(i.e. "no transformation performed") but due to §5.4.3 of the JVMS the sameClassCircularityError
can get thrown much later in unrelated user code which happens to use aConcurrentHashMap
concurrently.Other libraries like Data Dogs dd-trace-java or Glowroot ran into the same issue and solved it by preloading
ThreadLocalRandom
before registering aClassFileTransformer
.I'm aware of the discussions in #193: Explore options for warming up a class file transformer to avoid circularities, #859: How to exclude already loaded classes from transformation? and ByteBuddy's WarmupStrategy but to my understanding, this still requires the prior knowledge of which classes have to be preloaded.
So finally, my actual question is: do you think it is possible and useful that ByteBuddy provides a list of all the JDK/ASM classes it uses, or maybe even a feature which preloads all this classes before registering a transforming agent, in order to prevent such errors in the future?
Thank you and best regards, Volker