apache / datafusion-comet

Apache DataFusion Comet Spark Accelerator
https://datafusion.apache.org/comet
Apache License 2.0

Unable to Run tests of CometShuffleEncryptionSuite #367

Open ganeshkumar269 opened 2 weeks ago

ganeshkumar269 commented 2 weeks ago

Describe the bug

Whenever I run the test class CometShuffleEncryptionSuite, the tests fail or crash with no meaningful error message, though I do get this warning:

WARNING: /Library/Java/JavaVirtualMachines/openjdk-11.jdk/Contents/Home/bin/java is loading libcrypto in an unsafe way

In IntelliJ IDEA, I enabled "Log - Stack trace" for all exceptions and got the following output:

Exception 'sun.nio.fs.UnixException' occurred in thread 'Executor task launch worker for task 0.0 in stage 1.0 (TID 1)' at sun.nio.fs.UnixNativeDispatcher.lstat(UnixNativeDispatcher.java:335)
    at sun.nio.fs.UnixNativeDispatcher.lstat(UnixNativeDispatcher.java:335)
    at sun.nio.fs.UnixFileAttributes.get(UnixFileAttributes.java:72)
    at sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:232)
    at sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:110)
    at java.nio.file.Files.deleteIfExists(Files.java:1181)
    at java.nio.file.Files.copy(Files.java:3055)
    at org.apache.commons.crypto.NativeCodeLoader.extractLibraryFile(NativeCodeLoader.java:149)
    at org.apache.commons.crypto.NativeCodeLoader.findNativeLibrary(NativeCodeLoader.java:237)
    at org.apache.commons.crypto.NativeCodeLoader.loadLibrary(NativeCodeLoader.java:279)
    at org.apache.commons.crypto.NativeCodeLoader.<clinit>(NativeCodeLoader.java:52)
    at org.apache.commons.crypto.Crypto.isNativeCodeLoaded(Crypto.java:140)
    at org.apache.commons.crypto.random.OpenSslCryptoRandom.<clinit>(OpenSslCryptoRandom.java:54)
    at java.lang.Class.forName0(Class.java:-1)
    at java.lang.Class.forName(Class.java:398)
    at org.apache.commons.crypto.utils.ReflectionUtils.getClassByNameOrNull(ReflectionUtils.java:134)
    at org.apache.commons.crypto.utils.ReflectionUtils.getClassByName(ReflectionUtils.java:101)
    at org.apache.commons.crypto.random.CryptoRandomFactory.getCryptoRandom(CryptoRandomFactory.java:197)
    at org.apache.spark.security.CryptoStreamUtils$.createInitializationVector(CryptoStreamUtils.scala:138)
    at org.apache.spark.security.CryptoStreamUtils$.createCryptoOutputStream(CryptoStreamUtils.scala:56)
    at org.apache.spark.serializer.SerializerManager.$anonfun$wrapForEncryption$3(SerializerManager.scala:151)
    at org.apache.spark.serializer.SerializerManager$$Lambda$2902.217692044.apply(Unknown Source:-1)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.serializer.SerializerManager.wrapForEncryption(SerializerManager.scala:151)
    at org.apache.spark.serializer.SerializerManager.wrapStream(SerializerManager.scala:134)
    at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:163)
    at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:306)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:171)
    at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:101)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
    at org.apache.spark.scheduler.Task.run(Task.scala:139)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
    at org.apache.spark.executor.Executor$TaskRunner$$Lambda$1605.2076642299.apply(Unknown Source:-1)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.lang.Thread.run(Thread.java:829)
Exception 'java.lang.ClassNotFoundException' occurred in thread 'driver-heartbeater' at sun.reflect.misc.MethodUtil.findClass(MethodUtil.java:328)
    at sun.reflect.misc.MethodUtil.findClass(MethodUtil.java:328)
    at sun.reflect.misc.MethodUtil.loadClass(MethodUtil.java:309)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:527)
    at jdk.internal.misc.Unsafe.defineClass0(Unsafe.java:-1)
    at jdk.internal.misc.Unsafe.defineClass(Unsafe.java:1192)
    at jdk.internal.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
    at jdk.internal.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:400)
    at jdk.internal.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:394)
    at java.security.AccessController.doPrivileged(AccessController.java:-1)
    at jdk.internal.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:393)
    at jdk.internal.reflect.MethodAccessorGenerator.generateMethod(MethodAccessorGenerator.java:75)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:53)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:566)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260)
    at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:193)
    at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:175)
    at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:117)
    at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:54)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
    at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
    at javax.management.StandardMBean.getAttribute(StandardMBean.java:372)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:641)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
    at com.sun.jmx.mbeanserver.MXBeanProxy$GetHandler.invoke(MXBeanProxy.java:122)
    at com.sun.jmx.mbeanserver.MXBeanProxy.invoke(MXBeanProxy.java:167)
    at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:258)
    at com.sun.proxy.$Proxy26.getMemoryUsed(Unknown Source:-1)
    at org.apache.spark.metrics.MBeanExecutorMetricType.getMetricValue(ExecutorMetricType.scala:67)
    at org.apache.spark.metrics.SingleValueExecutorMetricType.getMetricValues(ExecutorMetricType.scala:46)
    at org.apache.spark.metrics.SingleValueExecutorMetricType.getMetricValues$(ExecutorMetricType.scala:44)
    at org.apache.spark.metrics.MBeanExecutorMetricType.getMetricValues(ExecutorMetricType.scala:60)
    at org.apache.spark.executor.ExecutorMetrics$.$anonfun$getCurrentMetrics$1(ExecutorMetrics.scala:103)
    at org.apache.spark.executor.ExecutorMetrics$.$anonfun$getCurrentMetrics$1$adapted(ExecutorMetrics.scala:102)
    at org.apache.spark.executor.ExecutorMetrics$$$Lambda$2903.1612655416.apply(Unknown Source:-1)
    at scala.collection.Iterator.foreach(Iterator.scala:943)
    at scala.collection.Iterator.foreach$(Iterator.scala:943)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
    at scala.collection.IterableLike.foreach(IterableLike.scala:74)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
    at org.apache.spark.executor.ExecutorMetrics$.getCurrentMetrics(ExecutorMetrics.scala:102)
    at org.apache.spark.SparkContext.reportHeartBeat(SparkContext.scala:2638)
    at org.apache.spark.SparkContext.$anonfun$new$21(SparkContext.scala:583)
    at org.apache.spark.SparkContext$$Lambda$733.912784040.apply$mcV$sp(Unknown Source:-1)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2088)
    at org.apache.spark.Heartbeater$$anon$1.run(Heartbeater.scala:46)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.util.concurrent.FutureTask.runAndReset$$$capture(FutureTask.java:305)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:-1)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.lang.Thread.run(Thread.java:829)
WARNING: /Library/Java/JavaVirtualMachines/openjdk-11.jdk/Contents/Home/bin/java is loading libcrypto in an unsafe way

Please advise on how I can run these tests successfully.

Steps to reproduce

In IntelliJ IDEA, click the Run Test button next to the CometShuffleEncryptionSuite class.

Expected behavior

No response

Additional context

No response

viirya commented 2 weeks ago

Have you tried to run it in command line?

ganeshkumar269 commented 2 weeks ago

./mvnw test -Dsuites=org.apache.comet.exec.CometShuffleEncryptionSuite

I ran the above command and got the same result.

akash-c-dev commented 1 week ago

@ganeshkumar269 / @viirya I encountered the same issue. It looks to be specific to macOS Sonoma: the test case fails on my Mac but not on Ubuntu.

On further debugging, the issue also exists in apache/spark, in AuthEngineSuite.java#testEncryptedMessage.

If you trace it back, you'll find similar issues reported against the commons-crypto library.

I was able to get it working on Mac by adding this

-Dcommons.crypto.lib.path=/usr/local/lib/ -Dcommons.crypto.lib.name=libcrypto.dylib

to https://github.com/apache/datafusion-comet/blob/main/pom.xml#L86
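Before editing the pom, it may help to confirm that the library those two flags point at is actually present on the machine. The following is a minimal sketch, not part of Comet, Spark, or commons-crypto: the class name CryptoLibCheck and the resolve helper are hypothetical, and the defaults are simply the values from the flags above. It reads the same two system properties, joins them into a path, and reports whether that file is readable. It does not load the library or call commons-crypto.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; it only inspects the file system.
public class CryptoLibCheck {

    // Joins the library directory and the library file name into one path.
    static Path resolve(String libPath, String libName) {
        return Paths.get(libPath).resolve(libName);
    }

    public static void main(String[] args) {
        // Defaults mirror the flag values suggested above (an assumption for this sketch).
        String libPath = System.getProperty("commons.crypto.lib.path", "/usr/local/lib/");
        String libName = System.getProperty("commons.crypto.lib.name", "libcrypto.dylib");

        Path lib = resolve(libPath, libName);
        String status = Files.isReadable(lib) ? "readable" : "missing or unreadable";
        System.out.println(lib + ": " + status);
    }
}
```

On JDK 11 this can be run as a single source file, for example: java -Dcommons.crypto.lib.path=/usr/local/lib/ -Dcommons.crypto.lib.name=libcrypto.dylib CryptoLibCheck.java. If it reports the file as missing, the flags will likely need to point at wherever libcrypto is installed on that machine.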

viirya commented 1 week ago

WARNING: /Library/Java/JavaVirtualMachines/openjdk-11.jdk/Contents/Home/bin/java is loading libcrypto in an unsafe way

We encountered a similar issue when first setting up the CI pipelines on Mac: https://github.com/apache/datafusion-comet/issues/76