apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[Java][FlightSQL] Arrow Flight SQL JDBC Driver makes ActionCreatePreparedStatementRequest even for Statement.executeQuery() and Statement.execute() #43622

Open jmao-denver opened 1 month ago

jmao-denver commented 1 month ago

Describe the bug, including details regarding any error messages, version, and platform.

We (Deephaven Data Labs) have a simple FlightSQL implementation that doesn't support PreparedStatement, on the understanding that ad hoc queries can still be executed through JDBC's Statement.execute()/executeQuery(). However, the following test case fails with an exception indicating that the driver uses a prepared statement anyway. Note that the ADBC driver for FlightSQL (Python) uses CommandStatementQuery for the same use case.

    @Test
    public void testJDBCExecuteQuery() throws SQLException {
        try (Connection connection =  DriverManager.getConnection("jdbc:arrow-flight-sql://localhost:" + localPort +
                "/?Authorization=Anonymous&useEncryption=false")) {
            Statement statement = connection.createStatement();
            ResultSet rs = statement.executeQuery("SELECT * FROM crypto where Instrument='BTC/USD' AND Price > 50000 and Exchange = 'binance'");
            ResultSetMetaData rsmd = rs.getMetaData();
            int columnsNumber = rsmd.getColumnCount();
            while (rs.next()) {
                for (int i = 1; i <= columnsNumber; i++) {
                    if (i > 1) System.out.print(",  ");
                    String columnValue = rs.getString(i);
                    System.out.print(columnValue + " " + rsmd.getColumnName(i));
                }
                System.out.println("");
            }
        }
    }
Method arrow.flight.protocol.FlightService/DoAction is unimplemented
cfjd.org.apache.arrow.flight.FlightRuntimeException: UNIMPLEMENTED: Method arrow.flight.protocol.FlightService/DoAction is unimplemented
    at app//cfjd.org.apache.arrow.flight.CallStatus.toRuntimeException(CallStatus.java:131)
    at app//cfjd.org.apache.arrow.flight.grpc.StatusUtils.fromGrpcRuntimeException(StatusUtils.java:164)
    at app//cfjd.org.apache.arrow.flight.grpc.StatusUtils$1.next(StatusUtils.java:250)
    at app//cfjd.org.apache.arrow.flight.sql.FlightSqlClient$PreparedStatement.<init>(FlightSqlClient.java:941)
    at app//cfjd.org.apache.arrow.flight.sql.FlightSqlClient.prepare(FlightSqlClient.java:728)
    at app//cfjd.org.apache.arrow.flight.sql.FlightSqlClient.prepare(FlightSqlClient.java:708)
    at app//org.apache.arrow.driver.jdbc.client.ArrowFlightSqlClientHandler.prepare(ArrowFlightSqlClientHandler.java:170)
    at app//org.apache.arrow.driver.jdbc.ArrowFlightMetaImpl.prepareAndExecute(ArrowFlightMetaImpl.java:161)
    at app//cfjd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
    at app//cfjd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
    at app//cfjd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)
    at app//io.deephaven.flightsql.test.FlightSqlTest.testJDBCExecuteQuery(FlightSqlTest.java:673)
    at java.base@11.0.21/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base@11.0.21/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base@11.0.21/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base@11.0.21/java.lang.reflect.Method.invoke(Method.java:566)
    at app//org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:688)
    at app//org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
    at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
    at app//org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
    at app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
    at app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
    at app//org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$

This Python script runs OK against the same server.

import adbc_driver_flightsql.dbapi

with adbc_driver_flightsql.dbapi.connect("grpc://localhost:10000",
    conn_kwargs={"adbc.flight.sql.rpc.call_header.authorization": "Anonymous"}) as conn:
    with conn.cursor() as cursor:
        cursor.execute("SELECT * from crypto")
        pa_table = cursor.fetch_arrow_table()

Version: Arrow 13.0.0. Platform: macOS 14.5.

Component(s)

Java

devinrsmith commented 1 month ago

It looks like adbc_driver_flightsql first tries doAction / CreatePreparedStatement / ActionCreatePreparedStatement "select * from crypto", but then falls back to getFlightInfo / CommandStatementQuery "select * from crypto".

It does seem a bit strange that the default behavior is to go the prepared statement route even when the query is ad hoc. For reference, the sequence diagrams here show the ad hoc and prepared statement flows: https://arrow.apache.org/docs/format/FlightSql.html#sequence-diagrams
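To make the "try prepared, then fall back to ad hoc" sequence concrete, here is a minimal, self-contained sketch of the control flow. It does not use the real Arrow Flight classes: `FallbackSketch`, `FlightSqlServer`, `AdHocOnlyServer`, and `UnimplementedException` are all hypothetical names made up for illustration, and the string "tickets" stand in for the FlightInfo a real server would return. It only models which RPCs get attempted and in what order against a server that, like ours, rejects DoAction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the call sequence only; not the Arrow Flight SQL client API.
final class FallbackSketch {

    /** Stand-in for a gRPC UNIMPLEMENTED status surfaced to the client. */
    static final class UnimplementedException extends RuntimeException {
        UnimplementedException(String message) {
            super(message);
        }
    }

    /** The server entry points involved in running a query (hypothetical interface). */
    interface FlightSqlServer {
        // Models DoAction carrying ActionCreatePreparedStatementRequest; returns a handle.
        String createPreparedStatement(String sql);

        // Models GetFlightInfo carrying CommandPreparedStatementQuery.
        String getFlightInfoForPrepared(String handle);

        // Models GetFlightInfo carrying CommandStatementQuery (the ad hoc path).
        String getFlightInfoForStatement(String sql);
    }

    /** A server that supports only ad hoc statements and records every call it receives. */
    static final class AdHocOnlyServer implements FlightSqlServer {
        final List<String> calls = new ArrayList<>();

        @Override
        public String createPreparedStatement(String sql) {
            calls.add("DoAction/CreatePreparedStatement");
            throw new UnimplementedException("DoAction is unimplemented");
        }

        @Override
        public String getFlightInfoForPrepared(String handle) {
            calls.add("GetFlightInfo/CommandPreparedStatementQuery");
            throw new UnimplementedException("prepared statements are unsupported");
        }

        @Override
        public String getFlightInfoForStatement(String sql) {
            calls.add("GetFlightInfo/CommandStatementQuery");
            return "ticket:" + sql;
        }
    }

    /** Tries the prepared statement route first, then falls back to the ad hoc route. */
    static String execute(FlightSqlServer server, String sql) {
        try {
            String handle = server.createPreparedStatement(sql);
            return server.getFlightInfoForPrepared(handle);
        } catch (UnimplementedException e) {
            // The server cannot prepare statements, so send the query text directly.
            return server.getFlightInfoForStatement(sql);
        }
    }

    public static void main(String[] args) {
        AdHocOnlyServer server = new AdHocOnlyServer();
        String ticket = execute(server, "SELECT * FROM crypto");
        System.out.println(ticket);
        System.out.println(server.calls);
    }
}
```

Without the `catch` branch, the first call's exception propagates to the caller, which matches the shape of the stack trace in the report: the failure happens while preparing, before any CommandStatementQuery is sent.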

aiguofer commented 3 weeks ago

Currently, both JDBC and ADBC driver bindings execute ALL queries (prepared or not) through the Arrow Flight SQL (AFS) PreparedStatement endpoints. See https://github.com/apache/arrow-adbc/issues/2040#issuecomment-2305242075