enso-org / enso

Enso Analytics is a self-service data prep and analysis platform designed for data teams.
https://ensoanalytics.com
Apache License 2.0

Enable Apache Arrow for improved Snowflake integration #9475

Closed hubertp closed 6 months ago

hubertp commented 7 months ago

Lack of Arrow is apparently problematic for Snowflake. Enabling it also means we need to add

--add-opens=java.base/java.nio=ALL-UNNAMED

as per https://arrow.apache.org/docs/java/install.html
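To check from inside a running JVM whether the flag took effect, something like the following could be used. It is only a sketch: the class and method names are made up for illustration, while `Module.isOpen` is the standard JDK 9+ API.

```java
import java.nio.ByteBuffer;

public class NioOpenCheck {
    /**
     * Returns true if the module that owns `target` opens target's package
     * to the module of `caller` (the unnamed module when run from the classpath).
     */
    public static boolean isOpenTo(Class<?> target, Class<?> caller) {
        Module owner = target.getModule();
        return owner.isOpen(target.getPackageName(), caller.getModule());
    }

    public static void main(String[] args) {
        // Expected to differ depending on whether the JVM was started with
        // --add-opens=java.base/java.nio=ALL-UNNAMED.
        System.out.println("java.nio open to this class: "
            + isOpenTo(ByteBuffer.class, NioOpenCheck.class));
    }
}
```

Running it once with and once without the `--add-opens` option should show the difference.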

hubertp commented 7 months ago

Tried to reproduce by enabling arrow in the connection.

So far couldn't reproduce.

hubertp commented 7 months ago
java.lang.RuntimeException: Failed to initialize MemoryUtil. Was Java started with `--add-opens=java.base/java.nio=ALL-UNNAMED`? (See https://arrow.apache.org/docs/java/install.html)
    at net.snowflake.client.jdbc.internal.apache.arrow.memory.util.MemoryUtil.<clinit>(MemoryUtil.java:146)
    at net.snowflake.client.jdbc.internal.apache.arrow.memory.ArrowBuf.getDirectBuffer(ArrowBuf.java:234)
    at net.snowflake.client.jdbc.internal.apache.arrow.memory.ArrowBuf.nioBuffer(ArrowBuf.java:229)
    at net.snowflake.client.jdbc.internal.apache.arrow.vector.ipc.ReadChannel.readFully(ReadChannel.java:87)
    at net.snowflake.client.jdbc.internal.apache.arrow.vector.ipc.message.MessageSerializer.readMessageBody(MessageSerializer.java:728)
    at net.snowflake.client.jdbc.internal.apache.arrow.vector.ipc.message.MessageChannelReader.readNext(MessageChannelReader.java:67)
    at net.snowflake.client.jdbc.internal.apache.arrow.vector.ipc.ArrowStreamReader.loadNextBatch(ArrowStreamReader.java:145)
    at net.snowflake.client.jdbc.SnowflakeResultSetSerializableV1.setFirstChunkRowCountForArrow(SnowflakeResultSetSerializableV1.java:1159)
    at net.snowflake.client.jdbc.SnowflakeResultSetSerializableV1.create(SnowflakeResultSetSerializableV1.java:629)
    at net.snowflake.client.jdbc.SnowflakeResultSetSerializableV1.create(SnowflakeResultSetSerializableV1.java:525)
    at net.snowflake.client.core.SFResultSetFactory.getResultSet(SFResultSetFactory.java:34)
    at net.snowflake.client.core.SFStatement.executeQueryInternal(SFStatement.java:243)
    at net.snowflake.client.core.SFStatement.executeQuery(SFStatement.java:149)
    at net.snowflake.client.core.SFStatement.execute(SFStatement.java:785)
    at net.snowflake.client.core.SFStatement.execute(SFStatement.java:693)
    at net.snowflake.client.jdbc.SnowflakeStatementV1.executeQueryInternal(SnowflakeStatementV1.java:296)
    at net.snowflake.client.jdbc.SnowflakePreparedStatementV1.executeQuery(SnowflakePreparedStatementV1.java:151)
    at org.graalvm.truffle/com.oracle.truffle.host.HostMethodDesc$SingleMethod$MHBase.invokeHandle(HostMethodDesc.java:371)
    at org.graalvm.truffle/com.oracle.truffle.host.GuestToHostCodeCache$GuestToHostInvokeHandle.executeImpl(GuestToHostCodeCache.java:88)
    at org.graalvm.truffle/com.oracle.truffle.host.GuestToHostRootNode.execute(GuestToHostRootNode.java:80)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:746)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callInlined(OptimizedCallTarget.java:550)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedRuntimeSupport.callInlined(OptimizedRuntimeSupport.java:250)
    at org.graalvm.truffle/com.oracle.truffle.host.GuestToHostRootNode.guestToHostCall(GuestToHostRootNode.java:102)
    at org.graalvm.truffle/com.oracle.truffle.host.HostMethodDesc$SingleMethod$MHBase.invokeGuestToHost(HostMethodDesc.java:407)
    at org.graalvm.truffle/com.oracle.truffle.host.HostExecuteNode.doInvoke(HostExecuteNode.java:876)
    at org.graalvm.truffle/com.oracle.truffle.host.HostExecuteNode.doOverloadedCached(HostExecuteNode.java:290)
    at org.graalvm.truffle/com.oracle.truffle.host.HostExecuteNodeGen$Inlined.executeAndSpecialize(HostExecuteNodeGen.java:506)
    at org.graalvm.truffle/com.oracle.truffle.host.HostExecuteNodeGen$Inlined.execute(HostExecuteNodeGen.java:363)
    at org.graalvm.truffle/com.oracle.truffle.host.HostObject.invokeMember(HostObject.java:464)
    at org.graalvm.truffle/com.oracle.truffle.host.HostObjectGen$InteropLibraryExports$Cached.invokeMemberNode_AndSpecialize(HostObjectGen.java:6701)
    at org.graalvm.truffle/com.oracle.truffle.host.HostObjectGen$InteropLibraryExports$Cached.invokeMember(HostObjectGen.java:6687)
    at org.graalvm.truffle/com.oracle.truffle.api.interop.InteropLibraryGen$CachedDispatch.invokeMember(InteropLibraryGen.java:8477)
    at org.enso.runtime/org.enso.interpreter.node.callable.resolver.HostMethodCallNode.resolveHostMethod(HostMethodCallNode.java:219)
    at org.enso.runtime/org.enso.interpreter.node.callable.resolver.HostMethodCallNodeGen.executeAndSpecialize(HostMethodCallNodeGen.java:157)
    at org.enso.runtime/org.enso.interpreter.node.callable.resolver.HostMethodCallNodeGen.execute(HostMethodCallNodeGen.java:119)
    at org.enso.runtime/org.enso.interpreter.node.callable.InvokeMethodNode.doPolyglot(InvokeMethodNode.java:524)
    at org.enso.runtime/org.enso.interpreter.node.callable.InvokeMethodNodeGen.executeAndSpecialize(InvokeMethodNodeGen.java:813)
    at org.enso.runtime/org.enso.interpreter.node.callable.InvokeMethodNodeGen.execute(InvokeMethodNodeGen.java:507)
    at org.enso.runtime/org.enso.interpreter.node.callable.InvokeCallableNode.invokeDynamicSymbol(InvokeCallableNode.java:268)
    at org.enso.runtime/org.enso.interpreter.node.callable.InvokeCallableNodeGen.executeAndSpecialize(InvokeCallableNodeGen.java:218)
    at org.enso.runtime/org.enso.interpreter.node.callable.InvokeCallableNodeGen.execute(InvokeCallableNodeGen.java:170)
    at org.enso.runtime/org.enso.interpreter.node.callable.ApplicationNode.executeGeneric(ApplicationNode.java:97)
    at org.enso.runtime/org.enso.interpreter.node.scope.AssignmentNodeGen.executeGeneric_generic1(AssignmentNodeGen.java:78)
    at org.enso.runtime/org.enso.interpreter.node.scope.AssignmentNodeGen.executeGeneric(AssignmentNodeGen.java:55)
    at org.enso.runtime/org.enso.interpreter.node.scope.AssignmentNodeGen.executeVoid(AssignmentNodeGen.java:98)
    at org.enso.runtime/org.enso.interpreter.node.callable.function.BlockNode.executeGeneric(BlockNode.java:52)
    at org.enso.runtime/org.enso.interpreter.node.callable.function.BlockNode.executeGeneric(BlockNode.java:54)
    at org.enso.runtime/org.enso.interpreter.node.ClosureRootNode.execute(ClosureRootNode.java:85)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:746)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:669)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:602)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:586)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callDirect(OptimizedCallTarget.java:535)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedDirectCallNode.call(OptimizedDirectCallNode.java:94)
    at org.enso.runtime/org.enso.interpreter.node.callable.ExecuteCallNode.callDirect(ExecuteCallNode.java:94)
    at org.enso.runtime/org.enso.interpreter.node.callable.ExecuteCallNodeGen.executeAndSpecialize(ExecuteCallNodeGen.java:171)
    at org.enso.runtime/org.enso.interpreter.node.callable.ExecuteCallNodeGen.executeCall(ExecuteCallNodeGen.java:101)
    at org.enso.runtime/org.enso.interpreter.node.callable.dispatch.SimpleCallOptimiserNode.executeDispatch(SimpleCallOptimiserNode.java:56)
    at org.enso.runtime/org.enso.interpreter.node.callable.dispatch.CurryNode.doCall(CurryNode.java:161)
    at org.enso.runtime/org.enso.interpreter.node.callable.dispatch.CurryNode.execute(CurryNode.java:107)
    at org.enso.runtime/org.enso.interpreter.node.callable.dispatch.InvokeFunctionNode.invokeCached(InvokeFunctionNode.java:116)
    at org.enso.runtime/org.enso.interpreter.node.callable.dispatch.InvokeFunctionNodeGen.executeAndSpecialize(InvokeFunctionNodeGen.java:137)
    at org.enso.runtime/org.enso.interpreter.node.callable.dispatch.InvokeFunctionNodeGen.execute(InvokeFunctionNodeGen.java:99)
    at org.enso.runtime/org.enso.interpreter.node.callable.InvokeCallableNode.invokeFunction(InvokeCallableNode.java:167)
    at org.enso.runtime/org.enso.interpreter.node.callable.InvokeCallableNodeGen.execute(InvokeCallableNodeGen.java:125)
    at org.enso.runtime/org.enso.interpreter.node.expression.builtin.resource.BracketNode.doBracket(BracketNode.java:74)
    at org.enso.runtime/org.enso.interpreter.node.expression.builtin.resource.BracketNodeGen.execute(BracketNodeGen.java:49)
    at org.enso.runtime/org.enso.interpreter.node.expression.builtin.resource.BracketMethodGen.execute(BracketMethodGen.java:161)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:746)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:669)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:602)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:586)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callDirect(OptimizedCallTarget.java:535)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedDirectCallNode.call(OptimizedDirectCallNode.java:94)
...
hubertp commented 7 months ago

As per @JaroslavTulach's request, I will try out his patch and a) create a temporary solution, b) submit a patch to arrow/snowflake that will make the temporary solution obsolete once merged.

hubertp commented 7 months ago

So I don't think the patch that simply replaces throwing the exception with logging it will work for Snowflake. Steps to reproduce:

  1. Take the arrow repo and check out maint-10.0.x to match Snowflake's version
  2. Build it
  3. Unpack existing sources: jar -xvf <snowflake-jdbc>/dependencies/arrow-memory-core-10.0.1.jar
  4. cp arrow/java/memory/memory-core/target/classes/org/apache/arrow/memory/util/MemoryUtil* org/apache/arrow/memory/util/
  5. Pack it again: jar -cvf ../arrow-memory-core-10.0.1.jar .
  6. Build the snowflake jar: ./mvnw clean -DskipTests=true package
  7. Copy it to the unmanaged classpath of std-snowflake: cp target/snowflake-jdbc.jar <enso>/std-bits/snowflake/lib/snowflake-jdbc-3.15.0.jar
  8. Build the distribution and test it on the project

For the custom snowflake to be picked up you will need this rather quick and easy hack to include unmanaged classpath:

diff --git a/project/StdBits.scala b/project/StdBits.scala
index 1e17616de3..76c89bf7a3 100644
--- a/project/StdBits.scala
+++ b/project/StdBits.scala
@@ -44,7 +44,7 @@ object StdBits {
           !graalVmOrgs.contains(orgName)
         })
       )
-      val relevantFiles =
+      val relevantFiles0 =
         libraryUpdates
           .select(
             configuration = configFilter,
@@ -52,6 +52,12 @@ object StdBits {
             artifact      = DependencyFilter.artifactFilter()
           )

+      val relevantFiles = if (destination.getPath.contains("Snowflake")) {
+        val all = (Compile/unmanagedJars).value.map(_.data)
+        relevantFiles0 ++ all
+      } else {
+        relevantFiles0
+      }
       val dependencyStore =
         streams.value.cacheStoreFactory.make("std-bits-dependencies")
       Tracked.diffInputs(dependencyStore, FileInfo.hash)(relevantFiles.toSet) {

and

--- a/build.sbt
+++ b/build.sbt
@@ -513,7 +513,7 @@ val hamcrestVersion         = "1.3"
 val netbeansApiVersion      = "RELEASE180"
 val fansiVersion            = "0.4.0"
 val httpComponentsVersion   = "4.4.1"
-val apacheArrowVersion      = "14.0.1"
+val apacheArrowVersion      = "10.0.1"
 val snowflakeJDBCVersion    = "3.15.0"

 // ============================================================================
@@ -2996,8 +2996,8 @@ lazy val `std-snowflake` = project
     Compile / packageBin / artifactPath :=
       `std-snowflake-polyglot-root` / "std-snowflake.jar",
     libraryDependencies ++= Seq(
-      "org.netbeans.api" % "org-openide-util-lookup" % netbeansApiVersion % "provided",
-      "net.snowflake"    % "snowflake-jdbc"          % snowflakeJDBCVersion
+      "org.netbeans.api" % "org-openide-util-lookup" % netbeansApiVersion % "provided"//,
+      //"net.snowflake"    % "snowflake-jdbc"          % snowflakeJDBCVersion
     ),

After all is done you will still get something along the lines of

Error: There was an SQL error: JDBC driver internal error: Fail to retrieve row count for first arrow chunk: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available.. [Query was: SELECT "NEWRETAILDATA"."INVOICE" AS "INVOICE", "NEWRETAILDATA"."STOCKCODE" AS  …])

when trying to return Arrow rows.

Note that it would be nice to simply replace the packaged Arrow arrow-memory-core jar with the one checked in to the snowflake repo in the dependencies directory, but that won't work. They seem to have some custom classes there which are nowhere to be found in the official repo. So unpacking and repacking the jar appears to be the only way to try out the patch.
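To illustrate why "log instead of throw" only moves the failure, here is a simplified sketch. It is not the actual Arrow `MemoryUtil` source; the class name and structure are hypothetical, and only the error message is copied from the output above. The static initializer no longer fails, but the first call that really needs the constructor still does.

```java
import java.lang.reflect.Constructor;
import java.nio.ByteBuffer;

public class LenientMemoryUtil {
    // Stays null when reflection is blocked; the failure is logged, not thrown.
    private static final Constructor<?> DIRECT_BUFFER_CONSTRUCTOR = lookup();

    private static Constructor<?> lookup() {
        try {
            Class<?> cls = Class.forName("java.nio.DirectByteBuffer");
            Constructor<?> ctor;
            try {
                ctor = cls.getDeclaredConstructor(long.class, long.class); // JDK 21 signature
            } catch (NoSuchMethodException e) {
                ctor = cls.getDeclaredConstructor(long.class, int.class);  // older JDKs
            }
            ctor.setAccessible(true); // blocked unless java.nio is opened
            return ctor;
        } catch (ReflectiveOperationException | RuntimeException e) {
            System.err.println("Direct buffer constructor unavailable: " + e);
            return null;
        }
    }

    public static boolean available() {
        return DIRECT_BUFFER_CONSTRUCTOR != null;
    }

    /** Wraps a raw address in a ByteBuffer, or fails if the constructor is unreachable. */
    public static ByteBuffer directBuffer(long address, int capacity) {
        if (DIRECT_BUFFER_CONSTRUCTOR == null) {
            throw new UnsupportedOperationException(
                "sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available");
        }
        try {
            // Match the boxed type to the constructor's second parameter.
            Object cap = DIRECT_BUFFER_CONSTRUCTOR.getParameterTypes()[1] == long.class
                ? (Object) Long.valueOf(capacity)
                : (Object) Integer.valueOf(capacity);
            return (ByteBuffer) DIRECT_BUFFER_CONSTRUCTOR.newInstance(address, cap);
        } catch (ReflectiveOperationException e) {
            throw new UnsupportedOperationException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("constructor available: " + available());
    }
}
```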

JaroslavTulach commented 7 months ago

Fail to retrieve row count for first arrow chunk: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available..

Thank you for the investigation. Can we get a stacktrace that fails on java.nio.DirectByteBuffer or Unsafe access?

hubertp commented 7 months ago

Fail to retrieve row count for first arrow chunk: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available..

Thank you for the investigation. Can we get a stacktrace that fails on java.nio.DirectByteBuffer or Unsafe access?

Roughly

java.lang.reflect.InaccessibleObjectException: Unable to make private java.nio.DirectByteBuffer(long,long) accessible: module java.base does not "opens java.nio" to unnamed module @61d42275
    at java.base/java.lang.reflect.AccessibleObject.throwInaccessibleObjectException(AccessibleObject.java:391)
    at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:367)
    at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:315)
    at java.base/java.lang.reflect.Constructor.checkCanSetAccessible(Constructor.java:194)
    at java.base/java.lang.reflect.Constructor.setAccessible(Constructor.java:187)
    at net.snowflake.client.jdbc.internal.apache.arrow.memory.util.MemoryUtil$2.run(MemoryUtil.java:138)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:319)
    at net.snowflake.client.jdbc.internal.apache.arrow.memory.util.MemoryUtil.directBufferConstructor(MemoryUtil.java:131)
    at net.snowflake.client.jdbc.internal.apache.arrow.memory.util.MemoryUtil.<clinit>(MemoryUtil.java:96)
    at net.snowflake.client.jdbc.internal.apache.arrow.memory.ArrowBuf.getDirectBuffer(ArrowBuf.java:234)
    at net.snowflake.client.jdbc.internal.apache.arrow.memory.ArrowBuf.nioBuffer(ArrowBuf.java:229)
    at net.snowflake.client.jdbc.internal.apache.arrow.vector.ipc.ReadChannel.readFully(ReadChannel.java:87)
    at net.snowflake.client.jdbc.internal.apache.arrow.vector.ipc.message.MessageSerializer.readMessageBody(MessageSerializer.java:728)
    at net.snowflake.client.jdbc.internal.apache.arrow.vector.ipc.message.MessageChannelReader.readNext(MessageChannelReader.java:67)
    at net.snowflake.client.jdbc.internal.apache.arrow.vector.ipc.ArrowStreamReader.loadNextBatch(ArrowStreamReader.java:145)
    at net.snowflake.client.jdbc.SnowflakeResultSetSerializableV1.setFirstChunkRowCountForArrow(SnowflakeResultSetSerializableV1.java:1159)
    at net.snowflake.client.jdbc.SnowflakeResultSetSerializableV1.create(SnowflakeResultSetSerializableV1.java:629)
    at net.snowflake.client.jdbc.SnowflakeResultSetSerializableV1.create(SnowflakeResultSetSerializableV1.java:525)
    at net.snowflake.client.core.SFResultSetFactory.getResultSet(SFResultSetFactory.java:34)
    at net.snowflake.client.core.SFStatement.executeQueryInternal(SFStatement.java:243)
    at net.snowflake.client.core.SFStatement.executeQuery(SFStatement.java:149)
    at net.snowflake.client.core.SFStatement.execute(SFStatement.java:785)
    at net.snowflake.client.core.SFStatement.execute(SFStatement.java:693)
    at net.snowflake.client.jdbc.SnowflakeStatementV1.executeQueryInternal(SnowflakeStatementV1.java:296)
    at net.snowflake.client.jdbc.SnowflakePreparedStatementV1.executeQuery(SnowflakePreparedStatementV1.java:151)
    at org.graalvm.truffle/com.oracle.truffle.host.HostMethodDesc$SingleMethod$MHBase.invokeHandle(HostMethodDesc.java:371)
    at org.graalvm.truffle/com.oracle.truffle.host.GuestToHostCodeCache$GuestToHostInvokeHandle.executeImpl(GuestToHostCodeCache.java:88)
    at org.graalvm.truffle/com.oracle.truffle.host.GuestToHostRootNode.execute(GuestToHostRootNode.java:80)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:746)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callInlined(OptimizedCallTarget.java:550)
    at org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedRuntimeSupport.callInlined(OptimizedRuntimeSupport.java:250)
    at org.graalvm.truffle/com.oracle.truffle.host.GuestToHostRootNode.guestToHostCall(GuestToHostRootNode.java:102)
    at org.graalvm.truffle/com.oracle.truffle.host.HostMethodDesc$SingleMethod$MHBase.invokeGuestToHost(HostMethodDesc.java:407)
    at org.graalvm.truffle/com.oracle.truffle.host.HostExecuteNode.doInvoke(HostExecuteNode.java:876)
    at org.graalvm.truffle/com.oracle.truffle.host.HostExecuteNode.doOverloadedCached(HostExecuteNode.java:290)
    at org.graalvm.truffle/com.oracle.truffle.host.HostExecuteNodeGen$Inlined.executeAndSpecialize(HostExecuteNodeGen.java:506)
    at org.graalvm.truffle/com.oracle.truffle.host.HostExecuteNodeGen$Inlined.execute(HostExecuteNodeGen.java:363)
    at org.graalvm.truffle/com.oracle.truffle.host.HostObject.invokeMember(HostObject.java:464)
    at org.graalvm.truffle/com.oracle.truffle.host.HostObjectGen$InteropLibraryExports$Cached.invokeMemberNode_AndSpecialize(HostObjectGen.java:6701)
...
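The top of this trace can probably be reproduced without Snowflake at all. A minimal sketch, assuming a JDK where `java.nio` is not opened by default (the class name is made up for illustration):

```java
import java.lang.reflect.Constructor;

public class DirectBufferProbe {
    /**
     * Tries to unlock the private DirectByteBuffer constructor.
     * Returns null on success, otherwise a description of the failure.
     */
    public static String probe() {
        try {
            Class<?> cls = Class.forName("java.nio.DirectByteBuffer");
            for (Constructor<?> c : cls.getDeclaredConstructors()) {
                Class<?>[] p = c.getParameterTypes();
                // JDK 21 declares (long, long); older JDKs declare (long, int).
                boolean wanted = p.length == 2 && p[0] == long.class
                    && (p[1] == long.class || p[1] == int.class);
                if (wanted) {
                    c.setAccessible(true); // this is the call that fails in the trace
                    return null;
                }
            }
            return "no matching constructor found";
        } catch (ClassNotFoundException | RuntimeException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        String failure = probe();
        System.out.println(failure == null ? "constructor is accessible" : failure);
    }
}
```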
enso-bot[bot] commented 7 months ago

Hubert Plociniczak reports a new STANDUP for yesterday (2024-03-26):

Progress: Attempting to patch #9475, as requested, to avoid opening java modules. Snowflake appears to use some custom Arrow version, making the process difficult. It should be finished by 2024-03-27.

Next Day: Next day I will be working on the #9475 task. Continue with the task. Also go back to benchmark issue.

JaroslavTulach commented 7 months ago

at net.snowflake.client.jdbc.internal.apache.arrow.memory.ArrowBuf.getDirectBuffer(ArrowBuf.java:234)
at net.snowflake.client.jdbc.internal.apache.arrow.memory.ArrowBuf.nioBuffer(ArrowBuf.java:229)

I assume the code here could just call ByteBuffer.slice() rather than trying to obtain the address of the ByteBuffer itself...

at net.snowflake.client.jdbc.internal.apache.arrow.vector.ipc.ReadChannel.readFully(ReadChannel.java:87)

...catching the exception and doing some regular Java operation (ByteBuffer.slice or ByteBuffer.allocateDirect, etc.) would allow us to get further. I'd like to understand the scope of fixes that need to be done to allow Arrow/Snowflake to run on the regular JDK 21.

If you don't share my enthusiasm for patching upstream projects, then please share a reproducer that works with --add-opens and fails with this exception without opening NIO.
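For reference, the "regular Java operation" mentioned above could look roughly like this. It is a sketch only: it assumes the code still holds the original `ByteBuffer`, which is not how Arrow's `ArrowBuf` currently works, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

public class SliceView {
    /** Returns a view of parent[offset, offset + length) that shares the parent's memory. */
    public static ByteBuffer view(ByteBuffer parent, int offset, int length) {
        ByteBuffer dup = parent.duplicate(); // independent position/limit, shared content
        dup.limit(offset + length);
        dup.position(offset);
        return dup.slice();                  // no address lookup, no reflection
    }

    public static void main(String[] args) {
        ByteBuffer block = ByteBuffer.allocateDirect(64);
        ByteBuffer window = view(block, 16, 8);
        window.putLong(0, 42L);
        // The write through the view is visible in the parent buffer.
        System.out.println(block.getLong(16));
    }
}
```

Nothing here needs `--add-opens`, because it never touches `DirectByteBuffer` internals.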

hubertp commented 7 months ago

Steps to reproduce:

  1. Custom Snowflake_Details that enables Arrow:
    
    --- a/distribution/lib/Standard/Snowflake/0.0.0-dev/src/Snowflake_Details.enso
    +++ b/distribution/lib/Standard/Snowflake/0.0.0-dev/src/Snowflake_Details.enso
    @@ -46,7 +46,7 @@ type Snowflake_Details
     jdbc_properties : Vector (Pair Text Text)
     jdbc_properties self =
         ## Avoid the Arrow dependency (https://community.snowflake.com/s/article/SAP-BW-Java-lang-NoClassDefFoundError-for-Apache-arrow)
    -        no_arrow = [Pair.new 'jdbc_query_result_format' 'json']
    +        no_arrow = [] #[Pair.new 'jdbc_query_result_format' 'json']
         account = [Pair.new 'account' self.account]
         credentials = [Pair.new 'user' self.credentials.username, Pair.new 'password' self.credentials.password]
         database = [Pair.new 'db' self.database]
  2. Run this program:

    from Standard.Snowflake import all
    from Standard.Database import all
    from Standard.Base import all

    main =
        operator63293 = ""
        operator93047 = Credentials.Username_And_Password "" ""
        connection = Database.connect (Snowflake_Details.Snowflake operator63293 operator93047 '')
        v = connection.read (SQL_Query.Table_Name '')

You will need a snowflake account with full access to `<DB_NAME>` and `<TABLE_NAME>` (ping @jdunkerley for the account)

If you run the runner with `JAVA_OPTS="--add-opens=java.base/java.nio=ALL-UNNAMED"` then there is no crash for the above program.
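In plain JDBC terms, the toggle from step 1 amounts to whether the `jdbc_query_result_format` property is sent. A rough sketch: the property keys are taken from the Snowflake_Details.enso diff above, while the class, method and parameter names are made up for illustration, and no connection is actually opened.

```java
import java.util.Properties;

public class SnowflakeProps {
    /** Builds connection properties; the JSON result format is forced unless Arrow is wanted. */
    public static Properties jdbcProperties(
            String account, String user, String password, String db, boolean useArrow) {
        Properties props = new Properties();
        props.setProperty("account", account);
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("db", db);
        if (!useArrow) {
            // Same effect as the no_arrow pair in Snowflake_Details.enso.
            props.setProperty("jdbc_query_result_format", "json");
        }
        return props;
    }

    public static void main(String[] args) {
        Properties p = jdbcProperties("acct", "user", "secret", "db", false);
        System.out.println(p.getProperty("jdbc_query_result_format"));
    }
}
```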
JaroslavTulach commented 7 months ago

Report from today's investigation. The Arrow version is specified here and it is 10.0.1.

Get sources from

wget https://repo1.maven.org/maven2/org/apache/arrow/arrow-memory-core/10.0.1/arrow-memory-core-10.0.1-sources.jar
wget https://repo1.maven.org/maven2/org/apache/arrow/arrow-memory-netty/10.0.1/arrow-memory-netty-10.0.1-sources.jar

Compile as

javac -cp snowflake-jdbc-3.15.0.jar:$HOME/.m2/repository/org/slf4j/slf4j-api/1.7.29/slf4j-api-1.7.29.jar MemoryUtil.java -d .

Alas, the furthest I could get is to:

Caused by: java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available
        at net.snowflake.client.jdbc.internal.apache.arrow.memory.util.MemoryUtil.directBuffer(MemoryUtil.java:186)
        at net.snowflake.client.jdbc.internal.apache.arrow.memory.ArrowBuf.getDirectBuffer(ArrowBuf.java:227)
        at net.snowflake.client.jdbc.internal.apache.arrow.memory.ArrowBuf.nioBuffer(ArrowBuf.java:222)
        at net.snowflake.client.jdbc.internal.apache.arrow.vector.ipc.ReadChannel.readFully(ReadChannel.java:87)

Debugging shows there is https://github.com/apache/arrow/blob/84f6edef697fd0fa0f5fce252c017a31e4ba3944/java/memory/memory-core/src/main/java/org/apache/arrow/memory/DefaultAllocationManagerOption.java#L94 and one can use a property to specify the https://github.com/apache/arrow/blob/84f6edef697fd0fa0f5fce252c017a31e4ba3944/java/memory/memory-core/src/main/java/org/apache/arrow/memory/DefaultAllocationManagerOption.java#L39C30-L39C67 allocation manager. However, neither the Netty nor the Unsafe allocation manager works without the --add-opens=java.base/java.nio=ALL-UNNAMED option.
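The selection logic behind that property is roughly the following. This is a simplified sketch rather than the Arrow source: the property name `arrow.allocation.manager.type` and the values `Netty` and `Unsafe` are written from memory and should be checked against the linked lines, and the class name is hypothetical.

```java
public class AllocationManagerChoice {
    enum Type { Netty, Unsafe, Unknown }

    // Assumed property name; verify against DefaultAllocationManagerOption.
    static final String PROPERTY = "arrow.allocation.manager.type";

    /** Maps the system property to a manager type; anything unrecognised is Unknown. */
    public static Type fromProperty() {
        String value = System.getProperty(PROPERTY);
        if ("Netty".equals(value)) {
            return Type.Netty;
        }
        if ("Unsafe".equals(value)) {
            return Type.Unsafe;
        }
        return Type.Unknown;
    }

    public static void main(String[] args) {
        // e.g. java -Darrow.allocation.manager.type=Unsafe AllocationManagerChoice
        System.out.println(fromProperty());
    }
}
```

As noted above, switching between the two does not help here, since both end up needing the opened `java.nio` package.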

In the end they want to convert ArrowBuf to ByteBuffer, but the ArrowBuf only has the https://github.com/apache/arrow/blob/84f6edef697fd0fa0f5fce252c017a31e4ba3944/java/memory/memory-core/src/main/java/org/apache/arrow/memory/ArrowBuf.java#L73 addr field - the underlying buffer is long lost somewhere deep (if it was allocated at all) - we would need to change https://github.com/apache/arrow/blob/84f6edef697fd0fa0f5fce252c017a31e4ba3944/java/memory/memory-netty/src/main/java/org/apache/arrow/memory/netty/NettyAllocationManager.java#L79 to record it. And that's a far bigger endeavor than we should be attempting.

Looks like PlatformDependent was written by people who don't trust Java GC much...
