oracle / graal

GraalVM compiles Java applications into native executables that start instantly, scale fast, and use fewer compute resources 🚀
https://www.graalvm.org
Other
20.21k stars 1.62k forks source link

Runtime failure in native image when `java.nio.channels.Selector` is `.wakeup` #4318

Closed helins closed 1 year ago

helins commented 2 years ago

Describe the issue

Initially, a native image of a Clojure program was behaving very well. After integrating a Java library doing some networking with NIO, the image now fails at runtime when a client tries connecting to a server through TCP. Precisely, when .wakeup is called on a java.nio.channels.Selector. See exception below.

No such problem occurs when running the uberjar used to creating the image.

Steps to reproduce the issue

Unfortunately the setup is complex and part of the codebase cannot be disclosed.

Describe GraalVM and your environment:

More details

Exception:

#error {
 :cause Bad file descriptor
 :via
 [{:type java.lang.InternalError
   :message java.io.IOException: Bad file descriptor
   :at [sun.nio.ch.KQueueSelectorImpl wakeup KQueueSelectorImpl.java 295]}
  {:type java.io.IOException
   :message Bad file descriptor
   :at [com.oracle.svm.jni.JNIJavaCallWrappers jniInvoke_VA_LIST_IOException_constructor_dfe3139624312c30e7f76f0e723ef6192050fcf4 JNIJavaCallWrappers.java 0]}]
 :trace
 [[com.oracle.svm.jni.JNIJavaCallWrappers jniInvoke_VA_LIST_IOException_constructor_dfe3139624312c30e7f76f0e723ef6192050fcf4 JNIJavaCallWrappers.java 0]
  [sun.nio.ch.IOUtil write1 IOUtil.java -1]
  [sun.nio.ch.KQueueSelectorImpl wakeup KQueueSelectorImpl.java 293]

Building the image:

 Warning: Ignoring server-mode native-image argument --no-server.
Executing [
/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/bin/java \
-XX:+UseParallelGC \
-XX:+UnlockExperimentalVMOptions \
-XX:+EnableJVMCI \
-Dtruffle.TrustAllTruffleRuntimeProviders=true \
-Dtruffle.TruffleRuntime=com.oracle.truffle.api.impl.DefaultTruffleRuntime \
-Dgraalvm.ForcePolyglotInvalid=true \
-Dgraalvm.locatorDisabled=true \
-Dsubstratevm.IgnoreGraalVersionCheck=true \
--add-exports=java.base/com.sun.crypto.provider=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.access.foreign=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.event=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.loader=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.logger=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.misc=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.module=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.org.objectweb.asm=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.org.xml.sax.helpers=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.perf=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.platform=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.ref=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.reflect=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.util.xml.impl=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.util.xml=ALL-UNNAMED \
--add-exports=java.base/sun.invoke.util=ALL-UNNAMED \
--add-exports=java.base/sun.net=ALL-UNNAMED \
--add-exports=java.base/sun.nio.ch=ALL-UNNAMED \
--add-exports=java.base/sun.reflect.annotation=ALL-UNNAMED \
--add-exports=java.base/sun.reflect.generics.reflectiveObjects=ALL-UNNAMED \
--add-exports=java.base/sun.reflect.generics.repository=ALL-UNNAMED \
--add-exports=java.base/sun.reflect.generics.tree=ALL-UNNAMED \
--add-exports=java.base/sun.security.jca=ALL-UNNAMED \
--add-exports=java.base/sun.security.provider=ALL-UNNAMED \
--add-exports=java.base/sun.security.ssl=ALL-UNNAMED \
--add-exports=java.base/sun.security.util=ALL-UNNAMED \
--add-exports=java.base/sun.text.spi=ALL-UNNAMED \
--add-exports=java.base/sun.util.calendar=ALL-UNNAMED \
--add-exports=java.base/sun.util.locale.provider=ALL-UNNAMED \
--add-exports=java.base/sun.util.resources=ALL-UNNAMED \
--add-exports=java.management/sun.management=ALL-UNNAMED \
--add-exports=java.xml.crypto/org.jcp.xml.dsig.internal.dom=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.aarch64=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.amd64=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.code.site=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.code.stack=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.code=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.common=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.hotspot.aarch64=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.hotspot.amd64=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.hotspot=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.meta=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.runtime=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.services=ALL-UNNAMED \
--add-exports=jdk.jfr/jdk.jfr.events=ALL-UNNAMED \
--add-exports=jdk.jfr/jdk.jfr.internal.consumer=ALL-UNNAMED \
--add-exports=jdk.jfr/jdk.jfr.internal.handlers=ALL-UNNAMED \
--add-exports=jdk.jfr/jdk.jfr.internal.jfc=ALL-UNNAMED \
--add-exports=jdk.jfr/jdk.jfr.internal=ALL-UNNAMED \
-XX:+UseJVMCINativeLibrary \
-Xss10m \
-Xms1g \
-Xmx14g \
-Djava.awt.headless=true \
-Dorg.graalvm.version=22.0.0.2 \
-Dorg.graalvm.config=CE \
-Dcom.oracle.graalvm.isaot=true \
-Djava.system.class.loader=com.oracle.svm.hosted.NativeImageSystemClassLoader \
-Xshare:off \
-Djdk.internal.lambda.disableEagerInitialization=true \
-Djdk.internal.lambda.eagerlyInitialize=false \
-Djava.lang.invoke.InnerClassLambdaMetafactory.initializeLambdas=false \
-javaagent:/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/builder/svm.jar \
-cp \
/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/builder/llvm-wrapper-shadowed.jar:/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/builder/svm.jar:/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/builder/native-image-base.jar:/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/builder/objectfile.jar:/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/builder/llvm-platform-specific-shadowed.jar:/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/builder/javacpp-shadowed.jar:/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/builder/svm-llvm.jar:/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/builder/pointsto.jar \
--module-path \
/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/truffle/truffle-api.jar \
'com.oracle.svm.hosted.NativeImageGeneratorRunner$JDK9Plus' \
-imagecp \
/Users/adam/Code/convex/clj/tool/build/uberjar/project/run.jar:/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/library-support.jar \
-H:Path=/Users/adam/Code/convex/clj/tool \
'-H:Class@manifest from file:///Users/adam/Code/convex/clj/tool/build/uberjar/project/run.jar=convex.run' \
'-H:Name@manifest from file:///Users/adam/Code/convex/clj/tool/build/uberjar/project/run.jar=run' \
'-H:Name@jar:file:///Users/adam/Code/convex/clj/tool/build/uberjar/project/run.jar!/META-INF/native-image/world.convex/run.clj/native-image.properties=build/native/project/run' \
'-H:ReflectionConfigurationFiles@jar:file:///Users/adam/Code/convex/clj/tool/build/uberjar/project/run.jar!/META-INF/native-image/world.convex/run.clj/native-image.properties=project/run/reflect-config.json' \
-H:ClassInitialization=:build_time \
-H:FallbackThreshold=0 \
-H:+DumpTargetInfo \
-H:+ReportExceptionStackTraces \
-H:CLibraryPath=/Users/adam/.sdkman/candidates/java/22.0.0.2.r17-grl/lib/svm/clibraries/darwin-amd64
]
========================================================================================================================
GraalVM Native Image: Generating 'build/native/project/run'...
========================================================================================================================
[1/7] Initializing...                                                                                   (10,0s @ 0,21GB)
--initialize-at-build-time without arguments has been deprecated when not using --diagnostics-mode. With GraalVM 22.0.0 --initialize-at-build-time will only work with --diagnostics-mode for debugging purposes.
The reason for deprecation is that --initalize-at-build-time does not compose, i.e., a single library can make assumptions that the whole classpath can be safely initialized at build time; that assumption is often incorrect.
 Version info: 'GraalVM 22.0.0.2 Java 17 CE'
# Printing compilation-target information to: /Users/adam/Code/convex/clj/tool/reports/target_info_20220213_104200.txt
[2/7] Performing analysis...  [***********]                                                             (63,5s @ 4,15GB)
# Printing native-library information to: /Users/adam/Code/convex/clj/tool/reports/native_library_info_20220213_104304.txt
  12.940 (96,61%) of 13.394 classes reachable
  14.642 (46,62%) of 31.405 fields reachable
  52.834 (80,38%) of 65.734 methods reachable
     359 classes,   103 fields, and   810 methods registered for reflection
      63 classes,    71 fields, and    54 methods registered for JNI access
[3/7] Building universe...                                                                               (4,0s @ 4,85GB)
[4/7] Parsing methods...      [**]                                                                       (3,9s @ 1,64GB)
[5/7] Inlining methods...     [*****]                                                                    (2,8s @ 2,68GB)
[6/7] Compiling methods...    [*****]                                                                   (29,3s @ 2,73GB)
[7/7] Creating image...                                                                                  (8,3s @ 4,96GB)
  26,35MB (36,57%) for code area:   37.583 compilation units
  39,07MB (54,22%) for image heap:  10.332 classes and 541.709 objects
   6,64MB ( 9,21%) for other data
  72,06MB in total
------------------------------------------------------------------------------------------------------------------------
Top 10 packages in code area:                               Top 10 object types in image heap:
   3,59MB clojure                                              8,34MB byte[] for general heap data
   1,62MB sun.security.ssl                                     5,98MB java.lang.Class
   1,46MB clojure.lang                                         2,48MB java.lang.String
 990,65KB taoensso                                             2,41MB java.lang.Object[]
 923,26KB java.util                                            2,05MB byte[] for java.lang.String
 761,05KB org.bouncycastle.math.ec.custom.sec                  1,60MB convex.core.data.RefDirect
 720,20KB com.sun.crypto.provider                              1,20MB clojure.lang.PersistentHashMap$BitmapIndexedNode
 686,80KB slf4j_timbre                                       813,80KB clojure.lang.Symbol
 637,23KB jdk.internal.net.http                              614,67KB convex.core.data.VectorLeaf
 568,51KB clojure.spec                                       585,09KB java.util.concurrent.ConcurrentHashMap$Node
      ... 322 additional packages                                 ... 5603 additional object types
                                           (use GraalVM Dashboard to see all)
------------------------------------------------------------------------------------------------------------------------
                        5,5s (4,3% of total time) in 36 GCs | Peak RSS: 7,10GB | CPU load: 3,74
------------------------------------------------------------------------------------------------------------------------
Produced artifacts:
 /Users/adam/Code/convex/clj/tool/build/native/project/run (executable)
 /Users/adam/Code/convex/clj/tool/build/native/project/run.build_artifacts.txt
========================================================================================================================
Finished generating 'build/native/project/run' in 2m 6s.
rodrigar-mx commented 2 years ago

Hi @helins. Please try running the native image tracing agent in case you have not. Otherwise, it will be very useful getting a reproducer for this issue because the error log is not giving enough information. I read that you cannot share your project, but please share a smaller program which run same nio function and results in the same error.

helins commented 2 years ago

Here is the result of running the tracing agent on an uber jar and reproducing the steps that would produce the aforementioned exception if done on a native image: native-agent-tracing-agent.zip

I can try to make a simplified build accessible. However it's not trivial: it's a Clojure code base requiring Clojure tooling with build scripts written for Babashka. I'm afraid you'll most likely spend way too much time just setting up the repo. Maybe I can be more helpful if you indicate me what would you like to inspect?

helins commented 1 year ago

Good news, turns out it was due to the selector being initialized in a static {} block in a class silently initialized at build time. When initialized at run time, it seems to work properly.