classgraph / classgraph

An uber-fast parallelized Java classpath scanner and module scanner.
MIT License
2.78k stars 289 forks source link

ConcurrentModificationException while getting ScanResult output #257

Closed pshirshov closed 6 years ago

pshirshov commented 6 years ago

Sometimes this happens:

java.util.ConcurrentModificationException
    at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:937)
    at java.base/java.util.ArrayList$Itr.next(ArrayList.java:891)
    at io.github.classgraph.ScanResult.getResourcesMatchingPattern(ScanResult.java:383)

My call looks innocent enough:

val configUrls = new ClassGraph().scan()
    .getResourcesMatchingPattern(configRegex.r.pattern)
    .asScala
    .map(_.getURL)

Is it possible to fix it?

lukehutch commented 6 years ago

As far as I can tell, this can't be a bug in ClassGraph, because ScanResult.allResources (which is what the offending line is iterating through) is only initialized once, in the ScanResult constructor. Is it possible you have some code in another thread that calls getAllResources(), then modifies the returned ResourceList?

pshirshov commented 6 years ago

Nope. This code is always being executed in a single thread and we never modify the list.

Though ConcurrentModificationException doesn't neccessary mean threading issues, it may be thrown if you modify a collection while iterating on it. But we don't modify, just read.

lukehutch commented 6 years ago

What am I missing? Please take a look at the code:

https://github.com/classgraph/classgraph/blob/master/src/main/java/io/github/classgraph/ScanResult.java#L383

The code never modifies allResources.

lukehutch commented 6 years ago

Are you using the latest version? Maybe I'm looking at the wrong line number.

pshirshov commented 6 years ago

Yup, latest. Actually it may be a sideffect, so I've tried to update my code to evaluate target collection in strict manner. In case doesn't help I'll report here.

lukehutch commented 6 years ago

A side effect of?

lukehutch commented 6 years ago

Though ConcurrentModificationException doesn't neccessary mean threading issues, it may be thrown if you modify a collection while iterating on it. But we don't modify, just read.

ClassGraph code doesn't modify the collection either, and based on the code example you gave, the only explanation would be interference by another thread. Or maybe asScala does something funky? That wouldn't be expected though...

pshirshov commented 6 years ago

So, asScala itself isn't strict but I can't see how may it break things.

Is ClassGraph API itself threadsafe? We don't use the collection across several threads, though we may invoke several scan()s concurrently.

lukehutch commented 6 years ago

So, asScala itself isn't strict but I can't see how may it break things.

How so? What exactly does it do under the hood?

Is ClassGraph API itself threadsafe? We don't use the collection across several threads, though we may invoke several scan()s concurrently.

You should create one ClassGraph instance per call to scan(), and the scan should be launched from a single thread, since it is itself multithreaded. Technically you could reuse a ClassGraph instance to launch another scan, but most of the configuration options are designed for one-off configuration, and the ClassGraph class itself was not designed with multithreaded reuse in mind.

The ScanResult itself should be threadsafe.

But in particular, just take a look at the source code for ScanResult that I linked, based on the stacktrace you posted. There is simply no modification of the collection in question in the ScanResult code, after the constructor has run, or in particular in the method in the stack trace (getResourcesMatchingPattern(ScanResult.java:383)). This is why I said that I don't see any way that this bug could be in ClassGraph itself. I have never seen a ConcurrentModificationException in any runs of any unit tests either, and there's relatively good coverage of the API among the unit tests.

pshirshov commented 6 years ago

How so? What exactly does it do under the hood?

In case you feed it with an iterable it would return you iterable, so asScala doesn't trigger immediate evaluation. But it should be a problem in our case.

You should create one ClassGraph instance per call to scan()

That's exactly what we do.

just take a look at the source code for ScanResult

I did. Seems to be safe. Okay, I will try to dig deeper and check if my idea with enforcing evaluation helped or not.

lukehutch commented 6 years ago

Ah! I think I see one way this may be happening. I thought that ScanResult.allResources was never modified after the constructor was run, but I missed one reference.

ScanResult#close() contains the following code:

        if (!isClosed) {
            isClosed = true;
            if (allResources != null) {
                for (final Resource classpathResource : allResources) {
                    classpathResource.close();
                }
                allResources.clear();     // (*)
                pathToResourceList.clear();
            }
            // [...]
        }

The line marked (*) is probably what is causing the ConcurrentModificationException, due to a race condition in your code, with one thread calling ScanResult#close(), and another thread continuing to use the ScanResult instance after it is closed. It would have to be a race condition between two threads, because ScanResult#getResourcesMatchingPattern(final Pattern pattern) checks if (isClosed) on entry, and throws an exception if #close() has already been called:

    public ResourceList getResourcesMatchingPattern(final Pattern pattern) {
        if (isClosed) {
            throw new IllegalArgumentException("Cannot use a ScanResult after it has been closed");
        }
        if (allResources == null || allResources.isEmpty()) {
            return new ResourceList(1);
        } else {
            final ResourceList filteredResources = new ResourceList();
            for (final Resource classpathResource : allResources) {
                final String relativePath = classpathResource.getPath();
                if (pattern.matcher(relativePath).matches()) {
                    filteredResources.add(classpathResource);
                }
            }
            return filteredResources;
        }
    }

Therefore, the thread that closes the ScanResult would have to close it while the second thread is iterating through allResources (i.e. after the isClosed check).

The close() method is the only place where ClassGraph modifies allResources after the ScanResult constructor exits. So there are only two possible explanations: (1) the race condition described above, or (2) your own code calling ScanResult#getAllResources() and then subsequently modifying the resulting list in one thread while the thread you describe is trying to iterate through that same list by calling getResourcesMatchingPattern(Pattern).

lukehutch commented 6 years ago

Let me know if it looks like the race condition is at fault. If so, you should fix the code so that the ScanResult is never accessed after calling ScanResult#close(), if you're manually closing it.

pshirshov commented 6 years ago

It would have to be a race condition between two threads,

Shouldn't be. There is strictly one thread operating on the result. Moreover we don't even manage the threads, this is happening within our macro.

And the result of computation never leaves local scope, so it physically can't be shared across threads.

This is the old chunk of our code:

def smth() = {
...
val configUrls = new ClassGraph().scan()
    .getResourcesMatchingPattern(configRegex.r.pattern)
    .asScala
    .map(_.getURL)
}
      val referenceConfig = configUrls.foldLeft(ConfigFactory.empty())(_ withFallback ConfigFactory.parseURL(_)).resolve()

      Some(new ConfigModule(AppConfig(referenceConfig)))
}

So, the method may be invoked by several threads but nothing is shared and the object, leaving the method is strict.

This chunk of code isn't correct - the resource was never closed. Though it shouldn't lead to CME by any chance in case you don't call .close() from finalize

Though as you can see the object is not being closed here so I'm wondering if it may be some finalization logic+escape analysis. Though from what I can remember in case you didn't invoke close - noone would do that for you.

lukehutch commented 6 years ago

OK, there's one more possible scenario then -- it is a threading issue, but not your threads. The other thread is the garbage collector.

The documentation says several times that you should allocate the ScanResult (the result of ClassGraph#scan() in a try-with-resources block, or to close the result manually in a finally block in Scala (since Scala doesn't yet have try-with-resources). You are not doing that in your code. The reasons given for using ScanResult in try-with-resources are so that there aren't any unpredictably large GC pauses when the garbage collector calls ScanResult#close() (since close() is also called by ScanResult#finalize()).

What must be happening is that Scala must take the idea of "tail recursion" / "tail calls" to the limit, and after new ClassGraph().scan() is called, returning the ScanResult, as soon as .getResourcesMatchingPattern(configRegex.r.pattern) is called, Scala realizes there are no more things that reference the ScanResult, and the ScanResult is scheduled for garbage collection. The garbage collection takes long enough to start that the if (isClosed) check in getResourcesMatchingPattern(Pattern) never fails. But then the ScanResult is garbage collected, triggering the finalizer to run, and ScanResult#close() gets called. This causes the allResources.clear() line to be run while the for (final Resource classpathResource : allResources) { } loop is still running. That's a pretty complicated chain of events!

So the fix for your code should be to simply do this, as recommended in the docs (albeit for a different reason):

val scanResult = new ClassGraph().scan()
try {
    val configUrls = scanResult.getResourcesMatchingPattern(configRegex.r.pattern)
    .asScala
    .map(_.getURL)
    // ... use configURLs
} finally {
    scanResult.close()
}

Please confirm if this fixes the problem for you.

However, since I suspect that this is not the last time this will come up, I will change the close() method so that it does not clear the internal collections. This creates a bit more work for the garbage collector, but it's better to solve this for cases where users forget to use try-with-resources or manually close the ScanResult in a finally block.

lukehutch commented 6 years ago

And actually, if manually closing the scanResult in a finally block does solve the issue for you, this is a very significant bug in Scala, since Scala should not be calling a finalizer for an object when methods of the object are still running! This would be a very interesting finding. (I know finalizers are considered "broken by design", but Scala still needs to honor their API contract.)

lukehutch commented 6 years ago

I pushed out a fix as version 4.2.9. Please test for the following two scenarios (both of these should fix the problem):

  1. Version 4.2.8 with the scanResult.close() in the finally block
  2. Version 4.2.9 without a finally block (i.e. your original code)
pshirshov commented 6 years ago

Ahhh, so you call close() from finalizer. In that case it explains the issue, it's escape analysis+this.

I've rewritten that chunk of code yesterday with try/finally, so it should fix the issue.

is a very significant bug in Scala, since Scala should not be calling a finalizer for an object when methods of the object are still running!

Scala has nothing to deal with it. I bet it's escape analysis.

I will check how would it work with try/finally. Till the moment there were no failures, so I'm almost sure it's fixed.

lukehutch commented 6 years ago

Ahhh, so you call close() from finalizer.

Yes, by design. It is also called by a shutdown hook, to be extra sure it is cleaned up. ClassGraph can create lots of temporary files in cases where you scan lots of inner jars (jars within jars), since they have to be extracted to be scanned.

See here:

https://github.com/classgraph/classgraph/wiki/API:-ScanResult#scanresult-lifecycle

Scala has nothing to deal with it. I bet it's escape analysis.

That's what I'm saying -- that Scala's escape analysis must be broken (or incomplete). Scala should have held a reference to the ScanResult against garbage collection when it knows that it just barely invoked a method on the very object that was scheduled for garbage collection.

In other words, there should be one reference to the invocation target held on the stack or in the heap during a method call, to guarantee that the invocation target cannot be garbage collected while the object's own methods are being executed. When the method returns, the invocation target reference can be released.

There is no way that the invocation target of a running method should ever be allowed to be garbage collected! If Scala is allowing this, it is broken.

pshirshov commented 6 years ago

That's what I'm saying -- that Scala's escape analysis must be broken (or incomplete).

This is JVM itself. Scala has to do nothing with GC policies/etc. -XX:+DoEscapeAnalysis. We always use this option and I think this is the first time we have any issues with it (though we've recently switched to jdk11, maybe they have more agressive policies).

Yes, the reference is still there, within asScala adaper. So this looks like an EA problem.

lukehutch commented 6 years ago

Scala has to do nothing with GC policies/etc.

But Scala could fix this behavior -- regardless of whether it is caused by a JVM runtime option or not -- by keeping around a reference to the invocation target, right? Or not? I'm pretty sure Java survives this, due to holding stack frames as long as possible (Scala is smarter about dropping references immediately after last usage).

I believe you would find that if you translated your code snippet directly into Java, you would not see the problem on 4.2.8.

-XX:+DoEscapeAnalysis

I see, I didn't know this was a JVM option. Usually compilers do their own escape analysis, not the VM. Why do you enable this? Is it for memory conservation or something?

Please see the two numbered testing requests a few comments back, in case you missed that. Thanks!

pshirshov commented 6 years ago

But Scala could fix this behavior -- regardless of whether it is caused by a JVM runtime option or not -- by keeping around a reference to the invocation target, right?

There is a reference, within adapter object.

  trait IterableWrapperTrait[A] extends ju.AbstractCollection[A] {
    val underlying: Iterable[A]
    def size = underlying.size
    override def iterator = IteratorWrapper(underlying.iterator)
    override def isEmpty = underlying.isEmpty
  }

Don't know why jvm misses it.

Why do you enable this? Is it for memory conservation or something?

This is a standard option for situations where you produce a lot of temporary local objects. It reduces GC time bcs these objects may be freed just in time.

Anyway, thanks for the help.

Please see the two numbered testing requests a few comments back, in case you missed that. Thanks!

Yup, will check.

lukehutch commented 6 years ago

OK, then I agree with you now that this is a JVM problem :-D Sorry!

It would be interesting to try to replicate this behavior with a simple case, e.g.:

    public static class ClassWithFinalizer {      
        private boolean finalized = false;

        public void longRunningMethod() {
            for (int i = 0; i < 10; i++) {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                }
                System.out.println(i + " : " + finalized);
            }
        }

        @Override
        protected void finalize() throws Throwable {
            finalized = true;
        }
    }

Try calling new ClassWithFinalizer().longRunningMethod() and see if you can replicate the brokenness in your same JVM environment with the same commandline options. (The code should never output "true".)

lukehutch commented 6 years ago

This is a standard option for situations where you produce a lot of temporary local objects. It reduces GC time bcs these objects may be freed just in time.

Ah, I was wondering how the GC was getting notified so quickly that the invocation target was not used in the rest of the block. Usually (at least in Java), the GC doesn't even check if a reference is no longer needed until the containing scope is popped off the stack.

lukehutch commented 6 years ago

Don't know why jvm misses it.

Aleksey Shipilëv would be very interested to know about this situation, probably, if it turns out that enabling escape analysis does in fact cause finalizers to run while methods in those objects are still running. That's very serious!

lukehutch commented 6 years ago

I mentioned:

Usually compilers do their own escape analysis, not the VM.

From this article by Aleksey: https://shipilev.net/jvm-anatomy-park/18-scalar-replacement/

But what really happens is that the compiler performs the so called Escape Analysis (EA), which can identify which newly created objects are not escaping into the heap, and then it can do a few interesting optimizations. Note that EA itself is not the optimization, it is the analysis phase that gives important pieces of data for the optimizer.

However, I did find one interesting fact: any class that overrides finalize() is supposed to be marked as a GlobalEscape class, which is forced to be allocated on the heap, so post-escape-analysis optimizations cannot be applied to that class. This is the case with ScanResult, so using the -XX:+DoEscapeAnalysis should have no effect whatsoever on ScanResult. Whatever is supposed to mark ScanResult as a GlobalEscape class (because it overrides finalize()) is failing.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.37.3797&rep=rep1&type=pdf (top of page 11)

As far as I can tell, it is the compiler's responsibility to determine if ScanResult overrides finalize() when a new ScanResult is created. But I haven't yet figured out how the compiler is supposed to notify the JVM as to whether a given object reference is supposed to be marked as heap-only, since the classfile format spec doesn't have any mention of escape analysis.

lukehutch commented 6 years ago

I tried replicating this behavior in Scala and Java, and could not trigger it using .getResourcesMatchingPattern(Pattern) using -XX:+DoEscapeAnalysis in JDK 11.

I did manage to replicate the behavior in ClassGraph-4.2.8 using .getAllResources(), after that method had returned, but that ConcurrentModificationException actually makes sense, because the code in .getAllResources() had already exited when the finalizer was run. (In your case, the .getResourcesMatchingPattern(Pattern) method is still running when the finalizer is run.)

ClassGraph-4.2.9 fixes the ConcurrentModificationException that I was able to trigger, so it should fix your case too, but I would still be interested in figuring out how your runtime environment ended up running the finalizer early.

pshirshov commented 6 years ago

This was a very rare case. I will try to get back with more info.

pshirshov commented 6 years ago

So, this snippet breaks with 100% probability:

import io.github.classgraph.{ClassGraph, ClassInfo}
import java.lang.management.ManagementFactory
import scala.collection.JavaConverters._

println(ClassGraph.getVersion)
println(System.getProperties.asScala.filterKeys(_.startsWith("java.")).mkString("\n"))
println(ManagementFactory.getRuntimeMXBean.getInputArguments)

while (true) {
  new ClassGraph().scan()
    .getResourcesMatchingPattern(".*\\.conf".r.pattern)
    .asScala
    .map(_.getURL)
}

I'm just executing it in IDEA repl.

The output:

``` import io.github.classgraph.{ClassGraph, ClassInfo} import java.lang.management.ManagementFactory import scala.collection.JavaConverters._ 4.2.6 java.specification.version -> 11 java.class.path -> /Users/liyon/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-1/183.3283.2/IntelliJ IDEA 2018.3 EAP.app/Contents/lib/jps-builders.jar:/Users/liyon/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-1/183.3283.2/IntelliJ IDEA 2018.3 EAP.app/Contents/lib/util.jar:/Users/liyon/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-1/183.3283.2/IntelliJ IDEA 2018.3 EAP.app/Contents/lib/trove4j.jar:/Users/liyon/Library/Application Support/IntelliJIdea2018.3/Scala/lib/scala-reflect.jar:/Users/liyon/Library/Application Support/IntelliJIdea2018.3/Scala/lib/jps/sbt-interface.jar java.vm.vendor -> Oracle Corporation java.vendor.url -> http://java.oracle.com/ java.vm.specification.version -> 11 java.specification.vendor -> Oracle Corporation java.version.date -> 2018-09-25 java.home -> /Library/Java/JavaVirtualMachines/jdk-11.jdk/Contents/Home java.vm.compressedOopsMode -> Zero based java.specification.name -> Java Platform API Specification java.vm.specification.vendor -> Oracle Corporation java.awt.graphicsenv -> sun.awt.CGraphicsEnvironment java.runtime.version -> 11+28 java.runtime.name -> Java(TM) SE Runtime Environment java.vm.name -> Java HotSpot(TM) 64-Bit Server VM java.vendor.version -> 18.9 java.vendor.url.bug -> http://bugreport.java.com/bugreport/ java.io.tmpdir -> /var/folders/1g/t908g0zs44n01tykdw8sq7l40000gn/T/ java.version -> 11 java.vm.specification.name -> Java Virtual Machine Specification java.awt.printerjob -> sun.lwawt.macosx.CPrinterJob java.library.path -> /Users/liyon/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:. java.vendor -> Oracle Corporation java.vm.info -> mixed mode java.vm.version -> 11+28 java.class.version -> 55.0 [-Xmx1024m, -Xss1m] java.util.ConcurrentModificationException at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1042) at java.base/java.util.ArrayList$Itr.next(ArrayList.java:996) at io.github.classgraph.ScanResult.getResourcesMatchingPattern(ScanResult.java:383) ... 40 elided ```

So, no threading, no even escape analysis flat, nothing.

pshirshov commented 6 years ago

So, modified snippet to get the full trace:

import io.github.classgraph.{ClassGraph, ClassInfo}
import java.lang.management.ManagementFactory

import scala.collection.JavaConverters._
import scala.compat.Platform.ConcurrentModificationException

println(ClassGraph.getVersion)
println(System.getProperties.asScala.filterKeys(_.startsWith("java.")).mkString("\n"))
println(ManagementFactory.getRuntimeMXBean.getInputArguments)

while (true) {
  try {
    new ClassGraph().scan()
      .getResourcesMatchingPattern(".*\\.conf".r.pattern)
      .asScala
      .map(_.getURL)
  } catch {
    case cme: ConcurrentModificationException =>
      cme.printStackTrace()
      throw cme
  }
}
output ``` import io.github.classgraph.{ClassGraph, ClassInfo} import java.lang.management.ManagementFactory import scala.collection.JavaConverters._ import scala.compat.Platform.ConcurrentModificationException 4.2.6 java.specification.version -> 11 java.class.path -> /Users/liyon/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-1/183.3283.2/IntelliJ IDEA 2018.3 EAP.app/Contents/lib/jps-builders.jar:/Users/liyon/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-1/183.3283.2/IntelliJ IDEA 2018.3 EAP.app/Contents/lib/util.jar:/Users/liyon/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-1/183.3283.2/IntelliJ IDEA 2018.3 EAP.app/Contents/lib/trove4j.jar:/Users/liyon/Library/Application Support/IntelliJIdea2018.3/Scala/lib/scala-reflect.jar:/Users/liyon/Library/Application Support/IntelliJIdea2018.3/Scala/lib/jps/sbt-interface.jar java.vm.vendor -> Oracle Corporation java.vendor.url -> http://java.oracle.com/ java.vm.specification.version -> 11 java.specification.vendor -> Oracle Corporation java.version.date -> 2018-09-25 java.home -> /Library/Java/JavaVirtualMachines/jdk-11.jdk/Contents/Home java.vm.compressedOopsMode -> Zero based java.specification.name -> Java Platform API Specification java.vm.specification.vendor -> Oracle Corporation java.awt.graphicsenv -> sun.awt.CGraphicsEnvironment java.runtime.version -> 11+28 java.runtime.name -> Java(TM) SE Runtime Environment java.vm.name -> Java HotSpot(TM) 64-Bit Server VM java.vendor.version -> 18.9 java.vendor.url.bug -> http://bugreport.java.com/bugreport/ java.io.tmpdir -> /var/folders/1g/t908g0zs44n01tykdw8sq7l40000gn/T/ java.version -> 11 java.vm.specification.name -> Java Virtual Machine Specification java.awt.printerjob -> sun.lwawt.macosx.CPrinterJob java.library.path -> /Users/liyon/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:. java.vendor -> Oracle Corporation java.vm.info -> mixed mode java.vm.version -> 11+28 java.class.version -> 55.0 [-Xmx1024m, -Xss1m] java.util.ConcurrentModificationException at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1042) at java.base/java.util.ArrayList$Itr.next(ArrayList.java:996) at io.github.classgraph.ScanResult.getResourcesMatchingPattern(ScanResult.java:383) at $line28.$read$$iw$$iw$$iw$$iw$.liftedTree1$1(:21) at $line28.$read$$iw$$iw$$iw$$iw$.(:19) at $line28.$read$$iw$$iw$$iw$$iw$.() at $line28.$eval$.$print$lzycompute(:7) at $line28.$eval$.$print(:6) at $line28.$eval.$print() at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:735) at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:999) at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:567) at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:34) at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:30) at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:33) at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:566) at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:593) at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:563) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperImpl.processChunk(ILoopWrapperImpl.scala:41) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactory.loadReplWrapperAndRun(ILoopWrapperFactory.scala:93) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactory.loadReplWrapperAndRun(ILoopWrapperFactory.scala:35) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactoryHandler.$anonfun$loadReplWrapperAndRun$3(ILoopWrapperFactoryHandler.scala:57) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactoryHandler$.org$jetbrains$jps$incremental$scala$local$worksheet$ILoopWrapperFactoryHandler$$withFilteredPath(ILoopWrapperFactoryHandler.scala:124) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactoryHandler.$anonfun$loadReplWrapperAndRun$2(ILoopWrapperFactoryHandler.scala:54) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactoryHandler.$anonfun$loadReplWrapperAndRun$2$adapted(ILoopWrapperFactoryHandler.scala:43) at scala.Option.foreach(Option.scala:257) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactoryHandler.loadReplWrapperAndRun(ILoopWrapperFactoryHandler.scala:43) at org.jetbrains.jps.incremental.scala.local.worksheet.WorksheetServer.loadAndRun(WorksheetServer.scala:29) at org.jetbrains.jps.incremental.scala.remote.Main$.make(Main.scala:85) at org.jetbrains.jps.incremental.scala.remote.Main$.nailMain(Main.scala:26) at org.jetbrains.jps.incremental.scala.remote.Main.nailMain(Main.scala) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at com.martiansoftware.nailgun.NGSession.run(NGSession.java:319) java.util.ConcurrentModificationException at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1042) at java.base/java.util.ArrayList$Itr.next(ArrayList.java:996) at io.github.classgraph.ScanResult.getResourcesMatchingPattern(ScanResult.java:383) at .liftedTree1$1(:21) ... 40 elided ```
pshirshov commented 6 years ago

Tried to switch on to Oracle JDK 8.

output ``` import io.github.classgraph.{ClassGraph, ClassInfo} import java.lang.management.ManagementFactory import scala.collection.JavaConverters._ import scala.compat.Platform.ConcurrentModificationException 4.2.6 java.runtime.name -> Java(TM) SE Runtime Environment java.vm.version -> 25.172-b11 java.vm.vendor -> Oracle Corporation java.vendor.url -> http://java.oracle.com/ java.vm.name -> Java HotSpot(TM) 64-Bit Server VM java.vm.specification.name -> Java Virtual Machine Specification java.runtime.version -> 1.8.0_172-b11 java.awt.graphicsenv -> sun.awt.CGraphicsEnvironment java.endorsed.dirs -> /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre/lib/endorsed java.io.tmpdir -> /var/folders/1g/t908g0zs44n01tykdw8sq7l40000gn/T/ java.vm.specification.vendor -> Oracle Corporation java.library.path -> /Users/liyon/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:. java.specification.name -> Java Platform API Specification java.class.version -> 52.0 java.awt.printerjob -> sun.lwawt.macosx.CPrinterJob java.specification.version -> 1.8 java.class.path -> /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/lib/tools.jar:/Users/liyon/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-1/183.3283.2/IntelliJ IDEA 2018.3 EAP.app/Contents/lib/jps-builders.jar:/Users/liyon/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-1/183.3283.2/IntelliJ IDEA 2018.3 EAP.app/Contents/lib/util.jar:/Users/liyon/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-1/183.3283.2/IntelliJ IDEA 2018.3 EAP.app/Contents/lib/trove4j.jar:/Users/liyon/Library/Application Support/IntelliJIdea2018.3/Scala/lib/scala-reflect.jar:/Users/liyon/Library/Application Support/IntelliJIdea2018.3/Scala/lib/jps/sbt-interface.jar java.vm.specification.version -> 1.8 java.home -> /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre java.specification.vendor -> Oracle Corporation java.vm.info -> mixed mode java.version -> 1.8.0_172 java.ext.dirs -> /Users/liyon/Library/Java/Extensions:/Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre/lib/ext:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java java.vendor -> Oracle Corporation java.vendor.url.bug -> http://bugreport.sun.com/bugreport/ [-Xmx1024m, -Xss1m] java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909) at java.util.ArrayList$Itr.next(ArrayList.java:859) at io.github.classgraph.ScanResult.getResourcesMatchingPattern(ScanResult.java:383) at $line10.$read$$iw$$iw$$iw$$iw$.liftedTree1$1(:21) at $line10.$read$$iw$$iw$$iw$$iw$.(:19) at $line10.$read$$iw$$iw$$iw$$iw$.() at $line10.$eval$.$print$lzycompute(:7) at $line10.$eval$.$print(:6) at $line10.$eval.$print() at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:735) at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:999) at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:567) at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:34) at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:30) at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:33) at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:566) at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:593) at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:563) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperImpl.processChunk(ILoopWrapperImpl.scala:41) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactory.loadReplWrapperAndRun(ILoopWrapperFactory.scala:93) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactory.loadReplWrapperAndRun(ILoopWrapperFactory.scala:35) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactoryHandler.$anonfun$loadReplWrapperAndRun$3(ILoopWrapperFactoryHandler.scala:57) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactoryHandler$.org$jetbrains$jps$incremental$scala$local$worksheet$ILoopWrapperFactoryHandler$$withFilteredPath(ILoopWrapperFactoryHandler.scala:124) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactoryHandler.$anonfun$loadReplWrapperAndRun$2(ILoopWrapperFactoryHandler.scala:54) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactoryHandler.$anonfun$loadReplWrapperAndRun$2$adapted(ILoopWrapperFactoryHandler.scala:43) at scala.Option.foreach(Option.scala:257) at org.jetbrains.jps.incremental.scala.local.worksheet.ILoopWrapperFactoryHandler.loadReplWrapperAndRun(ILoopWrapperFactoryHandler.scala:43) at org.jetbrains.jps.incremental.scala.local.worksheet.WorksheetServer.loadAndRun(WorksheetServer.scala:29) at org.jetbrains.jps.incremental.scala.remote.Main$.make(Main.scala:85) at org.jetbrains.jps.incremental.scala.remote.Main$.nailMain(Main.scala:26) at org.jetbrains.jps.incremental.scala.remote.Main.nailMain(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.martiansoftware.nailgun.NGSession.run(NGSession.java:319) java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909) at java.util.ArrayList$Itr.next(ArrayList.java:859) at io.github.classgraph.ScanResult.getResourcesMatchingPattern(ScanResult.java:383) at .liftedTree1$1(:21) ... 40 elided ```
pshirshov commented 6 years ago

Tried exactly the same snippet on ClassGraph 4.2.9 - it works.

pshirshov commented 6 years ago

asScala does the following:

  implicit def asScalaBufferConverter[A](l : ju.List[A]): AsScala[mutable.Buffer[A]] =
    new AsScala(asScalaBuffer(l))

  def asScalaBuffer[A](l: ju.List[A]): mutable.Buffer[A] = l match {
    case null                           => null
    case MutableBufferWrapper(wrapped)  => wrapped
    case _                              => new JListWrapper(l)
  }

and JListWrapper is

  case class JListWrapper[A](underlying: ju.List[A]) extends mutable.AbstractBuffer[A] with mutable.Buffer[A] {

I'm not sure what exactly is happening there. From all I can understand, finalizers shouldn't be triggered ahead of time.

lukehutch commented 6 years ago

@pshirshov Thanks for running these tests. I'm glad 4.2.9 seems to have fixed this for now.

However, this bug report made me realize that the change I made in 4.2.9 is insufficient to prevent all issues of this form. Even if the garbage collector does correctly wait until the last reference to a ScanResult to be dropped before it runs the finalizer, the finalizer removes temporary files, and this can cause classloading to fail. So in Scala, if you don't hold a reference to the ScanResult and call finalize in a finally block, then if you call ClassInfoList#loadClasses() or ClassInfo#loadClass() in a method chain like you do, it is possible for classloading to fail.

Therefore, the correct fix for this is not just to upgrade to 4.2.9, but you really must close the ScanResult in a finally block in Scala. I will update the documentation to strengthen the wording around this.

I'll take up the escape analysis issue with the JVM performance team to see if they can isolate why the class is not being marked as GlobalEscape (i.e. why the finalizer is running early).

lukehutch commented 6 years ago
lukehutch commented 6 years ago

@pshirshov So I raised this on the JDK mailing lists, and though I knew lots of people claim finalizers are broken, it turns out that finalizers are more or less "broken by design".

http://mail.openjdk.java.net/pipermail/jdk-dev/2018-October/002067.html

http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-October/034625.html

The Hotspot team made a decision to not include the object instances of currently-running methods in liveness analysis, which feeds into reachability analysis. So if a method is running, it is not considered live (and therefore reachable) unless it can be definitely determined that one of its fields will be accessed in the future -- and therefore, it may be finalized while the method is running.

This is a very bad design decision in my opinion, and something that can be fixed with very low overhead to Hotspot. But they made the decision in JDK 9 to add Reference::reachabilityFence instead, which you have to wrap all methods of finalizable objects with.

This is only available in JDK 9+, and doesn't even solve all cases of finalizers being run early. For example, wrapping the methods of ScanResult with reachabilityFence will still fail to prevent the finalizer being run early in the following example, where ClassInfo objects have a back-ref to the ScanResult:

new ClassGraph().enableClassInfo().scan().getAllClassInfo().loadClasses()

So this is unfortunately a hopeless situation to try to fix, until or unless the Hotspot team change their policy on choosing a tiny performance margin over arguable correctness.

What I will do instead is simply remove the finalizer method, and users of ClassGraph will simply have to deal with resource leaks if they don't close the ScanResult. This is reasonable, as it is the expectation for any other class that extends AutoCloseable or Closeable.

I will release 4.2.10 without the finalizer method momentarily. You should update your code to close ScanResult in a finally block, as shown in the link in the previous comment on this issue.

Thanks for your report, and for your patience -- this bug was very hard to get to the bottom of!

pshirshov commented 6 years ago

I will release 4.2.10 without the finalizer method momentarily. You should update your code to close ScanResult in a finally block, as shown in the link in the previous comment on this issue.

Yup, I've already did that. And yes, I support the drop of finalization logic.

Thanks a lot for your investigation, it's something I was completely unaware of.

lukehutch commented 6 years ago

@pshirshov Here are my release notes summarizing what I learned:

https://github.com/classgraph/classgraph/releases/tag/classgraph-4.2.10

Thanks a lot for your investigation, it's something I was completely unaware of.

You're welcome, and I had no idea of this either -- I was really, really surprised by this. And I was even more surprised that this is not only a well-known issue to the Hotspot team, but that they opted not to fix it for quite insignificant performance reasons. Everybody knows that some things about finalizers are broken, but they had a chance to fix some things about how they worked, to make them a little less broken, and they chose not to fix them.

This really sucks... it's one of a growing, long list of things that I keep discovering as I dive even deeper into the Java ecosystem that are frankly very disappointing :-/

lukehutch commented 6 years ago

@pshirshov Oh, one more thing that came up during the discussions on the JDK mailing lists is that -XX:+DoEscapeAnalysis is now default (adding that option to the JVM invocation has no effect).

andrebrait commented 6 years ago

I know this is closed but... Wow... That's really messed up.

Correctness should be the core thing. Never give up correctness for performance, ffs.

I mean, you expect that what you write is going to be done, not undefined behavior!

lukehutch commented 6 years ago

@andrebrait that was my reaction exactly. Correctness should never be sacrificed for performance. If this meant that you lose function inlining when you override finalize, then so be it, that's the price you pay. The fact that in over a decade of debating this issue, the Java community process was unable to fix this -- explicitly because they didn't want to give up what is probably a tiny performance margin, which it seems nobody even actually benchmarked (it was all assumption) -- shows just how broken the "design by committee" of the JCP is.

Running finalizers late (or not at all) is not necessarily a big deal. Running finalizers early very likely breaks everything that uses finalizers.

lukehutch commented 6 years ago

Some good commentary from David Lloyd of Red Hat:

http://mail.openjdk.java.net/pipermail/jdk-dev/2018-October/002085.html

lukehutch commented 6 years ago

Another interesting email, from Hans Boehm (author of the Boehm garbage collector):


I don't think inlining is the real problem. The general problem is that just because a field of an object is still in use, does not mean that the containing object is still reachable. The register holding the reference may have been reused. e.g. for the field value itself. The receiver reference inside a method is really just like any other parameter. And it's common to pass parameters in registers that can be reused once we no longer need the parameter values, e.g. because we've retrieved the fields we need.

One could solve this by preventing elimination of dead references. But, it's hard to do this only where it matters. Doing it everywhere is an option, but hasn't gotten a lot of traction. There is a more elaborate proposal to make this easier than reachabilityFence at https://docs.google.com/document/d/1yMC4VzZobMQTrVAraMy7xBdWPCJ0p7G5r_43yaqsj-s/edit?usp=sharing

Currently reachabilityFence is by far the best solution. (Passing the reference to an existing native method, e.g. by removing "static" is more subtle, but otherwise as good when it works.) I believe the cost of reachabilityFence in implementations that support it is near zero; all it has to do is keep a value around on the stack or in a register. In my experience, it's quite hard to portably, efficiently, and robustly implement reachabilityFence() where it's not already supported. Passing it to another Java method may work, but is not portable.

ReachabilityFence still has the substantial disadvantage that it often needs to be used in many methods after referring to a field that is cleaned up by a Cleaner or finalizer, etc. The above proposal allows the field declaration to be modified, instead of every method that touches it. But the programmer still needs to know that there is an issue. (@ReachabilitySensitive is a placeholder name. @Cleaned is the current front runner in my vicinity.)

pshirshov commented 6 years ago

Eh, I've uncovered a pandora's box. My world will never be the same.

Though the conclusion is simple: we must never use finalizers except of situations when our app is completely independent of when they may be executed and if they may be executed. Crap-crap-crap.

lukehutch commented 6 years ago
GedMarc commented 6 years ago

that is a good read