Closed andychoi-ibm closed 1 year ago
/cc @Karm (mandrel), @galderz (mandrel), @geoand (openshift), @iocanel (openshift), @pedroigor (oidc), @sberyozkin (oidc), @zakkak (mandrel)
Hey @stuartwdouglas @cescoffier @vietj @pmlopes Can you recommend something like configuring Vert.x Client to prefer IPV4 resolution ?
This is probably more like an external setup problem, as per the problem description:
It seems to be hanging resolving the hostname of the configured OIDC auth server while starting the app in OpenShift that doesn't have IPv6 configured.
passing quarkus.native.additional-build-args=-J-Djava.net.preferIPv4Stack=true,-J-Djava.net.preferIPv4Addresses=true
should work around the issue though (i.e. it should not result in java.net.Inet6AddressImpl.lookupAllHostAddr(Inet6AddressImpl.java)
being invoked) .
@andychoi-ibm could you please provide a maven project that reproduces the issue?
@sberyozkin no much we can do. Vert.x resolves the DNS name using the default JDK API (which is blocking). If the DNS server is slow you get this. However, it should only happen once, then the resolution is cached.
There is an async DNS client in Vert.x. However, it is experimental and is disabled by default.
-Djava.net.preferIPv4Stack=true
would force the resolution to IPv4 - however if the DNS server is slow, it won't help.
Thanks @cescoffier, I'd like to keep this issue open until quarkus.native.additional-build-args=-J-Djava.net.preferIPv4Stack=true,-J-Djava.net.preferIPv4Addresses=true
is cleared, as asked by Foivos
@zakkak You can use https://github.com/quarkusio/quarkus-quickstarts/tree/main/microprofile-graphql-quickstart to reproduce the issue.
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-oidc</artifactId>
</dependency>
quarkus.log.level=DEBUG
quarkus.oidc.enabled=false
quarkus.native.additional-build-args=-J-Djava.net.preferIPv4Stack=true,-J-Djava.net.preferIPv4Addresses=true
%prod.quarkus.oidc.enabled=true
%prod.quarkus.oidc.application-type=service
%prod.quarkus.oidc.discovery-enabled=false
%prod.quarkus.oidc.auth-server-url=https://quarkus.io
%prod.quarkus.oidc.jwks-path=/jwks
Note: quarkus.oidc.auth-server-url can be set to any URL.
podman build -f src/main/docker/Dockerfile.native -t default-route-openshift-image-registry.apps.
podman push default-route-openshift-image-registry.apps.
@cescoffier We're running the app in AWS ROSA and we haven't experienced DNS server slowness with other apps.
Thank you for the reproducer @andychoi-ibm.
Regarding java.net.preferIPv4*
not being respected.
java.net.InetAddress
is registered for class reinitialization by GraalVM https://github.com/oracle/graal/blob/251442c6a74ef7ea93ed5239e669ed2b7fb96214/substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/jdk/JNIRegistrationJavaNet.java#L59 which means that no matter what value you pass to java.net.preferIPv4*
at build-time the application will check the value at runtime and use the corresponding stack.
This way we can run the same native executable on different platforms without failing if IPv6 is or isn't available on that platform etc. This also allows us to choose whether to use IPv6 or not even when supported.
So to work-around the issue using java.net.preferIPv4*
you need to move it's definition from build-time to run-time, i.e.:
you may remove quarkus.native.additional-build-args=-J-Djava.net.preferIPv4Stack=true,-J-Djava.net.preferIPv4Addresses=true
and pass -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Addresses=true
when running the container.
@zakkak Thank you for providing the detailed analysis and a solution! Would you be able to recommend a good option to pass the system properties to the running container?
You can edit the docker.native file and pass the property.
Would you be able to recommend a good option to pass the system properties to the running container?
@andychoi-ibm No, unfortunately I am not familiar with that part, I will leave it to someone more familiar (and I will read the answer to learn something new)
You can edit the docker.native file and pass the property.
@cescoffier (a bit off topic) is this the standard way of doing this for JVM-mode as well? I would expect some consistency between the different modes, e.g., if application.properties
is loaded/read when running in JVM-mode, I would expect it to get loaded in native-mode and in containerized builds.
@zakkak @cescoffier Thank you, both! I was able to disable IPv6 by passing the properties via CMD but, unfortunately, got the same result with IPv4 so it seems Clement was right that this is caused by slow DNS resolution.
2023-03-16 21:31:53,736 WARN [io.ver.cor.imp.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 3999 ms, time limit is 2000 ms: io.vertx.core.VertxException: Thread blocked at java.base@17.0.6/java.net.Inet4AddressImpl.lookupAllHostAddr(Inet4AddressImpl.java)
I'm not sure if this is an issue with OpenShift/RHEL but I noticed the issue doesn't occur when I add a simple init container with an echo command, but the total startup time is roughly the same as it takes time to run the init container...
I think we can close this issue as it doesn't seem to be an issue with Quarkus.
The issue no longer occurred once we updated the OCP version to 4.10.53.
This is the manual configuration for Keycloak is:
quarkus.oidc.application-type=service
quarkus.oidc.discovery-enabled=false
quarkus.oidc.token.path=/protocol/openid-connect/token
quarkus.oidc.jwks.path=/protocol/openid-connect/certs
quarkus.oidc.user.info-path=/protocol/openid-connect/userinfo
quarkus.oidc.introspection-path=/protocol/openid-connect/tokens/introspect
PD: Remember that if you are using Keycloak, the parameter: quarkus.oidc.auth-server-url
should end up with your realm, i.e. <your-path>/auth/realms/<your-realm>
Describe the bug
It seems to be hanging resolving the hostname of the configured OIDC auth server while starting the app in OpenShift that doesn't have IPv6 configured. I tried configuring the following config option but got the same result.
quarkus.native.additional-build-args=-J-Djava.net.preferIPv4Stack=true,-J-Djava.net.preferIPv4Addresses=true
The app works otherwise but this prevents the app from starting in milliseconds when running a native executable.
2023-03-15 01:48:23,811 WARN [io.ver.cor.imp.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 3998 ms, time limit is 2000 ms: io.vertx.core.VertxException: Thread blocked at java.base@17.0.6/java.net.Inet6AddressImpl.lookupAllHostAddr(Inet6AddressImpl.java) at java.base@17.0.6/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:933) at java.base@17.0.6/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1534) at java.base@17.0.6/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:852) at java.base@17.0.6/java.net.InetAddress.getAllByName0(InetAddress.java:1524) at java.base@17.0.6/java.net.InetAddress.getAllByName(InetAddress.java:1381) at java.base@17.0.6/java.net.InetAddress.getAllByName(InetAddress.java:1305) at java.base@17.0.6/java.net.InetAddress.getByName(InetAddress.java:1255) at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:156) at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:153) at java.base@17.0.6/java.security.AccessController.executePrivileged(AccessController.java:144) at java.base@17.0.6/java.security.AccessController.doPrivileged(AccessController.java:569) at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:153) at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:41) at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:61) at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:53) at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:55) at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:31) at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:106) at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:206) at io.netty.bootstrap.Bootstrap.doResolveAndConnect(Bootstrap.java:162) at io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:139) at io.vertx.core.net.impl.ChannelProvider.handleConnect(ChannelProvider.java:155) at io.vertx.core.net.impl.ChannelProvider.connect(ChannelProvider.java:106) at io.vertx.core.net.impl.ChannelProvider.connect(ChannelProvider.java:92) at io.vertx.core.net.impl.NetClientImpl.connectInternal2(NetClientImpl.java:273) at io.vertx.core.net.impl.NetClientImpl.lambda$connectInternal$1(NetClientImpl.java:240) at io.vertx.core.impl.future.FutureImpl$3.onSuccess(FutureImpl.java:141) at io.vertx.core.impl.future.FutureBase.lambda$emitSuccess$0(FutureBase.java:54) at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base@17.0.6/java.lang.Thread.run(Thread.java:833) at org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:775) at org.graalvm.nativeimage.builder/com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:203) 2023-03-15 01:48:24,927 DEBUG [io.net.han.ssl.SslHandler] (vert.x-eventloop-thread-1) [id: 0x87f1d756, L:/172.21.21.49:55900 - R:xxx.ibm.com/###.###.###.###:443] HANDSHAKEN: protocol:TLSv1.2 cipher suite:TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
Expected behavior
The native executable starts in milliseconds.
Actual behavior
The native executable starts in 4-5 seconds.
How to Reproduce?
Steps to reproduce the behavior.
Output of
uname -a
orver
Linux graphql-api-7f776b8c48-2hn4f 4.18.0-305.57.1.el8_4.x86_64 #1 SMP Tue Jul 5 17:55:55 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
Output of
java -version
java version "17.0.5" 2022-10-18 IBM Semeru Runtime Certified Edition 17.0.5.0 (build 17.0.5+8) Eclipse OpenJ9 VM 17.0.5.0 (build openj9-0.35.0, JRE 17 Linux amd64-64-Bit Compressed References 20221018_272 (JIT enabled, AOT enabled) OpenJ9 - e04a7f6c1 OMR - 85a21674f JCL - e1263b46ebe based on jdk-17.0.5+8)
GraalVM version (if different from Java)
mandrel-java17-22.3.1.0-Final
Quarkus version or git rev
2.16.0.Final
Build tool (ie. output of
mvnw --version
orgradlew --version
)Maven home: /home/xxx/apache-maven-3.9.0 Java version: 17.0.5, vendor: IBM Corporation, runtime: /opt/ibm/ibm-semeru-certified-17-jdk Default locale: en_US, platform encoding: UTF-8 OS name: "linux", version: "4.18.0-425.10.1.el8_7.x86_64", arch: "amd64", family: "unix"
Additional information
No response