Closed adutra closed 4 years ago
Imported from JAVA-2616, originally commented by @phact on Mon, 20 Jan 2020 17:28:58 GMT.
I confirm that the workaround to `force-java-clock` circumvents the error 🙂
datastax-java-driver { advanced.timestamp-generator.force-java-clock = true}
in `application.conf`
I’ll pull debug logs to see if we can find out why it’s not switching to java clocks automatically.
Imported from JAVA-2616, originally commented by @phact on Mon, 20 Jan 2020 18:44:15 GMT.
Here we go:
13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Trying with default package com.datastax.oss.driver.internal.core.retry.DefaultRetryPolicy 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Creating a SpeculativeExecutionPolicy from config option advanced.speculative-execution-policy.class 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Building from unqualified name NoSpeculativeExecutionPolicy 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Trying with default package com.datastax.oss.driver.internal.core.specex.NoSpeculativeExecutionPolicy 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Creating a AddressTranslator from config option advanced.address-translator.class 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Building from unqualified name PassThroughAddressTranslator 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Trying with default package com.datastax.oss.driver.internal.core.addresstranslation.PassThroughAddressTranslator 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Creating a NodeStateListener from config option advanced.node-state-listener.class 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Building from unqualified name NoopNodeStateListener 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Trying with default package com.datastax.oss.driver.internal.core.metadata.NoopNodeStateListener 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Creating a RequestTracker from config option advanced.request-tracker.class 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Building from unqualified name NoopRequestTracker 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Trying with default package com.datastax.oss.driver.internal.core.tracker.NoopRequestTracker 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Creating a RequestThrottler from config option advanced.throttler.class 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Building from unqualified name PassThroughRequestThrottler 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Trying with default package com.datastax.oss.driver.internal.core.session.throttling.PassThroughRequestThrottler 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Creating a AuthProvider from config option advanced.auth-provider.class 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Option is not defined, skipping 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Creating a SslEngineFactory from config option advanced.ssl-engine-factory.class 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Option is not defined, skipping 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Creating a TimestampGenerator from config option advanced.timestamp-generator.class 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Building from unqualified name AtomicTimestampGenerator 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Trying with default package com.datastax.oss.driver.internal.core.time.AtomicTimestampGenerator 13:42:29 INFO [co.da.os.dr.in.co.ti.Clock] (s0-admin-0) Using native clock for microsecond precision 13:42:29 DEBUG [co.da.os.dr.in.co.co.EventBus] (s0-admin-0) [s0] Registering com.datastax.oss.driver.internal.core.metadata.LoadBalancingPolicyWrapper$$Lambda$bf70e5ca416854d866be64436cd33a1fab9e8815@7f07db471b30 for class com.datastax.oss.driver.internal.core.metadata.NodeStateEvent 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Creating a TimestampGenerator from config option advanced.timestamp-generator.class 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Building from unqualified name AtomicTimestampGenerator 13:42:29 DEBUG [co.da.os.dr.in.co.ut.Reflection] (s0-admin-0) Trying with default package com.datastax.oss.driver.internal.core.time.AtomicTimestampGenerator 13:42:29 INFO [co.da.os.dr.in.co.ti.Clock] (s0-admin-0) Using native clock for microsecond precision 13:42:32 INFO [pr.ProjectResource] (main) Could not connect to cassandra java.lang.IllegalArgumentException: Error instantiating class AtomicTimestampGenerator (specified by advanced.timestamp-generator.class): com.kenai.jffi.Foreign.invokeN2O1(JJJJLjava/lang/Object;III)J [symbol: Java_com_kenai_jffi_Foreign_invokeN2O1 or Java_com_kenai_jffi_Foreign_invokeN2O1__JJJJLjava_lang_Object_2III] at com.datastax.oss.driver.internal.core.util.Reflection.buildFromConfig(Reflection.java:236) at com.datastax.oss.driver.internal.core.util.Reflection.buildFromConfig(Reflection.java:94) at com.datastax.oss.driver.internal.core.context.DefaultDriverContext.buildTimestampGenerator(DefaultDriverContext.java:363) at com.datastax.oss.driver.internal.core.util.concurrent.LazyReference.get(LazyReference.java:55) at com.datastax.oss.driver.internal.core.context.DefaultDriverContext.getTimestampGenerator(DefaultDriverContext.java:732) at com.datastax.oss.driver.internal.core.session.DefaultSession$SingleThreaded.init(DefaultSession.java:327) at com.datastax.oss.driver.internal.core.session.DefaultSession$SingleThreaded.access$1000(DefaultSession.java:280) at com.datastax.oss.driver.internal.core.session.DefaultSession.lambda$init$0(DefaultSession.java:126) at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38) at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:73) at io.netty.channel.DefaultEventLoop.run(DefaultEventLoop.java:54) at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748) at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:460) at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThreads.java:193) Caused by: java.lang.UnsatisfiedLinkError: com.kenai.jffi.Foreign.invokeN2O1(JJJJLjava/lang/Object;III)J [symbol: Java_com_kenai_jffi_Foreign_invokeN2O1 or Java_com_kenai_jffi_Foreign_invokeN2O1__JJJJLjava_lang_Object_2III] at com.oracle.svm.jni.access.JNINativeLinkage.getOrFindEntryPoint(JNINativeLinkage.java:145) at com.oracle.svm.jni.JNIGeneratedMethodSupport.nativeCallAddress(JNIGeneratedMethodSupport.java:57) at com.kenai.jffi.Foreign.invokeN2O1(Foreign.java) at com.kenai.jffi.Invoker.invokeN2(Invoker.java:1038) at com.datastax.oss.driver.internal.core.os.Native$LibCLoader$LibC$jnr$ffi$2.gettimeofday(Unknown Source) at com.datastax.oss.driver.internal.core.os.Native.currentTimeMicros(Native.java:57) at com.datastax.oss.driver.internal.core.time.NativeClock.fetchTimeMicros(NativeClock.java:74) at com.datastax.oss.driver.internal.core.time.NativeClock.<init>(NativeClock.java:57) at com.datastax.oss.driver.internal.core.time.Clock.getInstance(Clock.java:41) at com.datastax.oss.driver.internal.core.time.MonotonicTimestampGenerator.buildClock(MonotonicTimestampGenerator.java:109) at com.datastax.oss.driver.internal.core.time.MonotonicTimestampGenerator.<init>(MonotonicTimestampGenerator.java:43) at com.datastax.oss.driver.internal.core.time.AtomicTimestampGenerator.<init>(AtomicTimestampGenerator.java:52) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at com.datastax.oss.driver.internal.core.util.Reflection.buildFromConfig(Reflection.java:229) ... 16 more
Imported from JAVA-2616, originally commented by @adutra on Mon, 20 Jan 2020 19:00:39 GMT.
Thank you @phact ! From the logs above we see:
INFO [co.da.os.dr.in.co.ti.Clock] (s0-admin-0) Using native clock for microsecond precision
That’s weird because it shows evidence that the driver could call gettimeofday once when loading the Native class, but a subsequent call to that syscall fails with a linkage error.
Imported from JAVA-2616, originally commented by @absurdfarce on Fri, 13 Mar 2020 16:45:13 GMT.
Just recording observations here, mainly so that I don’t forget them
Underlying cause here looks to be an inability to find the native jffi libs in order to execute the JNI call. Some of the proxies generated by jnr-ffi will also chuck ULE in some cases but by the time we get to the code in the stack trace above we’re already past that point.
The named function is certainly in the underlying JAR, however, as one might expect:
[mersault@localhost tmp]$ cp /work/maven/repo/com/github/jnr/jffi/1.2.19/jffi-1.2.19-native.jar . [mersault@localhost tmp]$ jar xf jffi-1.2.19-native.jar [mersault@localhost tmp]$ nm jni/x86_64-Linux/libjffi-1.2.so | grep invokeN2O1 00000000000086f0 T Java_com_kenai_jffi_Foreign_invokeN2O1
So this sure looks to be an outright linkage problem.
It’s worth noting that we don’t see these issues with a basic app running in Graal. https://github.com/absurdfarce/graal-cassandra-driver includes just such an application which loads the jffi native code without issue. So this certainly seems to be something in Quarkus.
The native JARs weren’t originally included as resources in the native image, but even after I add those in and toss in a few other fixes from the discussion in https://datastax-oss.atlassian.net/browse/JAVA-2663 the problem is still there. I even tried a custom driver build with the libc fix mentioned in https://datastax-oss.atlassian.net/browse/JAVA-2663 and it didn’t make any difference, which was expected (since the libc version used shouldn’t impact underling linkage of native code) but reasonable to rule out. https://github.com/absurdfarce/cassandra-quarkus-starter contains the results of that investigation to date.
At this point i don’t have explanations to the following two points:
Imported from JAVA-2616, originally commented by @adutra on Sat, 14 Mar 2020 20:06:11 GMT.
@absurdfarce thanks for this thorough investigation! Wrt to the initial problem:
Native.isCurrentTimeMicrosAvailable() is returning true here, but the native call isn't available.
If I understand you correctly, we still don’t know what’s happening, do we?
Imported from JAVA-2616, originally commented by @absurdfarce on Mon, 16 Mar 2020 04:54:34 GMT.
@adutra Yes, that’s correct.
I believe when I ran this via a debugger I wasn’t seeing the expected deployment of the native lib into the tmp directory. Such a failure would at least explain the current exception, although it just pushes the problem back a step (i.e. why isn’t the lib deploying in that case?) I need to re-test that to confirm, though… that’s all from memory and it’s possible I’ve confused that case with something else I saw.
Even if that is correct, it doesn’t explain why the initial native call succeeds and returns a value of 0 while subsequent calls result in an exception. The static initializers in Native use exactly that test to determine whether gettimeofday() is available; only thing I can figure is that the generated proxy is (fraudulently) returning zero on the initial call as some kind of setup operation and then trying the native linkage from that point forward. But that’s still just speculation; I haven’t been able to nail down the root cause yet.
Imported from JAVA-2616, originally commented by @tomekl007 on Mon, 30 Mar 2020 12:56:43 GMT.
Currently, I’ve implemented native support (quarkus, graal) by setting:TIMESTAMP_GENERATOR_FORCE_JAVA_CLOCK:
https://github.com/datastax/cassandra-quarkus/pull/1/files#diff-c8fa2affa6aa6a83bbe9b43abebe85e6R67
and using @Substitute:
https://github.com/datastax/cassandra-quarkus/pull/1/files#diff-d549fb86e4c13de9dbd13abb41da510bR19
to change the isGetProcessIdAvailable to return always false - due to that fact, the JNR code is never called in the Native setup.
Using this approach, I’ve managed to run Native after building integration-tests module:
mvn clean package -Pnative-image -DskipTests
And tests that are using @NativeImageTest are passing:
I found only one extension that is using native calls, which is Kafka streams for underlying rocksdb:
https://github.com/quarkusio/quarkus/blob/master/extensions/kafka-streams/deployment/src/main/java/io/quarkus/kafka/streams/deployment/KafkaStreamsProcessor.java#L113
They are registering this as JniRuntimeAccessBuildItem.
I think that we should continue to work on this ticket as a followup after the first quarkus PR:
https://github.com/datastax/cassandra-quarkus/pull/1/
and try to enable JNR support there (removing substitutions, register JNI).
We should use existing @NativeImageTest as a baseline to verify if the native is working.
Here is a conversation that may be helpful as well:
https://github.com/oracle/graal/issues/675
Imported from JAVA-2616, originally commented by @absurdfarce on Mon, 13 Apr 2020 16:20:38 GMT.
https://datastax-oss.atlassian.net/browse/JAVA-2663 will provide a more complete solution to this issue so there’s not really much else to do with this ticket.
Imported from [JAVA-2616](https://datastax-oss.atlassian.net/browse/JAVA-2616), originally reported by adutra on Mon, 20 Jan 2020 17:08:32 GMT.
Under GraalVM, the driver is unable to instantiate AtomicTimestampGenerator:
It seems that Native.isCurrentTimeMicrosAvailable() is returning true here, but the native call isn't available.
We should 1) investigate what's happening and 2) maybe also improve the implementation of com.datastax.oss.driver.internal.core.time.Clock#getInstance to catch LinkageError and automatically switch to Java clocks.
┆Issue is synchronized with this Jira Task by Unito