snowflakedb / snowflake-jdbc

Snowflake JDBC Driver
Apache License 2.0

SNOW-840018 Allow reading Arrow record batch streams from result set #1422

Open BryanCutler opened 1 year ago

BryanCutler commented 1 year ago

I would like to read the result set of a query as streams of Arrow record batches. QueryResultFormat.ARROW serializes the data in Arrow stream format, but I don't see a way to read those streams directly. Could this be exposed for direct reading, similar to the ArrowStreamLoader in gosnowflake? See https://github.com/snowflakedb/gosnowflake/blob/master/connection.go#L577

What is the current behavior?

The Arrow format is used to serialize data from a result set, but the serialized streams don't seem to be exposed for direct reading.

What is the desired behavior?

Read the serialized Arrow streams directly.

How would this improve snowflake-jdbc?

By consuming Arrow data directly, entire batches can be read at a time instead of one scalar value at a time, which should improve performance.
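
To make the difference in access pattern concrete, here is a minimal sketch in plain Java. It does not use the driver or the Arrow library: `LongColumnBatch`, `getLong`, `sumPerValue`, and `sumPerBatch` are hypothetical names invented for illustration, and a `long[]` stands in for one column of an Arrow record batch. It only shows the shape of the two styles, not a benchmark.

```java
import java.util.Arrays;
import java.util.List;

final class BatchVsScalar {

    // Hypothetical stand-in for one record batch holding a single BIGINT column.
    // Real code would use Arrow's vector classes; this is an assumption for illustration.
    static final class LongColumnBatch {
        final long[] values;

        LongColumnBatch(long[] values) {
            this.values = values;
        }
    }

    // Hypothetical per-value accessor, mirroring the one-call-per-cell style of ResultSet.getLong.
    static long getLong(LongColumnBatch batch, int row) {
        return batch.values[row];
    }

    // Scalar style: one accessor call per value.
    static long sumPerValue(List<LongColumnBatch> batches) {
        long total = 0;
        for (LongColumnBatch batch : batches) {
            for (int row = 0; row < batch.values.length; row++) {
                total += getLong(batch, row);
            }
        }
        return total;
    }

    // Batch style: hand the whole column of each batch to a bulk operation.
    static long sumPerBatch(List<LongColumnBatch> batches) {
        long total = 0;
        for (LongColumnBatch batch : batches) {
            total += Arrays.stream(batch.values).sum();
        }
        return total;
    }

    public static void main(String[] args) {
        List<LongColumnBatch> batches = List.of(
                new LongColumnBatch(new long[] {1, 2, 3}),
                new LongColumnBatch(new long[] {4, 5, 6}));
        System.out.println(sumPerValue(batches)); // prints 21
        System.out.println(sumPerBatch(batches)); // prints 21
    }
}
```

Both methods return the same result; the point is that the batch style lets a consumer keep the data columnar and operate on whole batches, which is what direct access to the Arrow streams would allow.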

References, Other Background

A similar interface is provided in gosnowflake: https://github.com/snowflakedb/gosnowflake/blob/master/connection.go#L577

The Arrow ADBC driver currently makes use of it and shows good performance gains: https://github.com/apache/arrow-adbc/blob/main/go/adbc/driver/snowflake/record_reader.go#L242

In snowflake-jdbc, the Arrow stream is read here: https://github.com/snowflakedb/snowflake-jdbc/blob/master/src/main/java/net/snowflake/client/jdbc/SnowflakeChunkDownloader.java#L892

What is your Snowflake account identifier, if any?

sfc-gh-spanaite commented 1 year ago

Thanks for raising this feature request with us. We'll review internally (no estimated timeline for a response due to other priorities).

aiguofer commented 1 year ago

We're interested in this as well. We're currently converting the JDBC results back to Arrow for a variety of JDBC connectors, but we'd love to keep the data in Arrow format the entire time if possible.

It seems others have been looking to do the same: https://stackoverflow.com/questions/65997340/how-can-i-retrieve-data-in-arrow-format-when-querying-snowflake-in-java

carlossc commented 3 months ago

At Denodo, we are also interested in this. We would like to improve our existing Snowflake connector to take advantage of it.