Open RealDeanZhao opened 3 weeks ago
Hmm, Postgres NUMERIC fields without a fixed precision/scale can't actually be supported by Arrow because those are variable/unlimited precision and Arrow assumes a fixed precision per field.
For BigQuery, we need to read the type correctly.
Note that we have been considering a JNI bridge to use the native ADBC drivers for both these databases. That should be faster than the JDBC driver and should handle these cases better, since those drivers have had individual attention for each database's quirks (versus the JDBC driver, which just tries to adapt JDBC results generically).
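The fixed-precision constraint above can be seen with plain `java.math.BigDecimal`: values read from an unconstrained NUMERIC column can each carry a different precision and scale, so no single fixed Decimal(precision, scale) type fits the whole column. A minimal stdlib-only sketch (the sample values are illustrative, not from the original report):

```java
import java.math.BigDecimal;
import java.util.List;

public class UnconstrainedNumericDemo {
    public static void main(String[] args) {
        // Values a Postgres NUMERIC column declared without precision/scale
        // could legally hold side by side.
        List<BigDecimal> values = List.of(
                new BigDecimal("0.1"),
                new BigDecimal("12345.678"),
                new BigDecimal("1E+30"));

        for (BigDecimal v : values) {
            System.out.println(v + " precision=" + v.precision() + " scale=" + v.scale());
        }
        // Each value reports a different (precision, scale) pair, but an Arrow
        // Decimal vector needs one fixed pair declared up front for the field.
    }
}
```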
https://arrow.apache.org/cookbook/java/jdbc.html#id5
Is it possible to use a custom JdbcToArrowConfig to avoid this issue? It seems that JdbcArrowReader uses a default config:
JdbcArrowReader(BufferAllocator allocator, ResultSet resultSet, @Nullable Schema overrideSchema) throws AdbcException {
super(allocator);
JdbcToArrowConfig config = makeJdbcConfig(allocator);
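The ADBC JdbcArrowReader does hardcode its config, but when using the arrow-jdbc adapter directly you can build your own JdbcToArrowConfig with a type converter that pins a fixed Decimal type for unparameterized NUMERIC/DECIMAL columns. A sketch, assuming a recent arrow-jdbc version (the 38,9 fallback pair is an arbitrary choice, and older Arrow versions use a two-argument ArrowType.Decimal constructor without the bit width):

```java
import java.sql.Types;
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

import org.apache.arrow.adapter.jdbc.JdbcToArrowConfig;
import org.apache.arrow.adapter.jdbc.JdbcToArrowConfigBuilder;
import org.apache.arrow.adapter.jdbc.JdbcToArrowUtils;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.vector.types.pojo.ArrowType;

public class NumericConfigSketch {
    public static JdbcToArrowConfig makeConfig(BufferAllocator allocator) {
        Calendar utc = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.ROOT);
        return new JdbcToArrowConfigBuilder(allocator, utc)
                .setJdbcToArrowTypeConverter(field -> {
                    if (field.getJdbcType() == Types.NUMERIC
                            || field.getJdbcType() == Types.DECIMAL) {
                        int precision = field.getPrecision();
                        int scale = field.getScale();
                        if (precision <= 0) {
                            // Unconstrained NUMERIC: pick a fixed pair up front.
                            // Values that do not fit this pair will still fail
                            // at copy time.
                            return new ArrowType.Decimal(38, 9, 128);
                        }
                        return new ArrowType.Decimal(precision, scale, 128);
                    }
                    // Everything else: fall back to the adapter's default mapping.
                    return JdbcToArrowUtils.getArrowTypeFromJdbcType(field, utc);
                })
                .build();
    }
}
```

The resulting config can then be passed to JdbcToArrow.sqlToArrowVectorIterator(resultSet, config); the ADBC JDBC driver itself does not currently expose a hook for this.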
What happened?
Querying a Postgres NUMERIC field through the JDBC adapter fails with: Cannot get simple type for type DECIMAL
Stack Trace
Invalid Input Error: arrow_scan: get_next failed(): java.lang.RuntimeException: Error occurred while getting next schema root.
	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:190)
	at org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader.loadNextBatch(JdbcArrowReader.java:87)
	at org.apache.arrow.c.ArrayStreamExporter$ExportedArrayStreamPrivateData.getNext(ArrayStreamExporter.java:66)
Caused by: java.lang.RuntimeException: Error occurred while consuming data.
	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:112)
	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:163)
	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:183)
	... 2 more
Caused by: java.lang.UnsupportedOperationException: Cannot get simple type for type DECIMAL
	at org.apache.arrow.vector.types.Types$MinorType.getType(Types.java:815)
	at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:49)
	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:98)
	... 4 more
How can we reproduce the bug?
A NUMERIC field declared without scale and precision will cause the error.
I also tried debugging the code and found that the pg JDBC driver's getBigDecimal returns a BigDecimal with precision 1. This causes the actual error: "BigDecimal precision cannot be greater than that in the Arrow vector".
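For context on the precision-1 observation: java.math.BigDecimal defines precision as the number of digits in the unscaled value, so a small fractional value genuinely comes back with precision 1 even though its scale is larger. A stdlib-only illustration (the sample value 0.0001 is hypothetical; the exact check that then fails inside the Arrow vector depends on the Arrow version):

```java
import java.math.BigDecimal;

public class PrecisionOneDemo {
    public static void main(String[] args) {
        // For 0.0001 the unscaled value is 1, so precision() is 1
        // while scale() is 4.
        BigDecimal v = new BigDecimal("0.0001");
        System.out.println("precision=" + v.precision() + " scale=" + v.scale());
        // When such a value is copied into a Decimal vector whose declared
        // (precision, scale) pair does not match, the adapter rejects it with
        // "BigDecimal precision cannot be greater than that in the Arrow vector".
    }
}
```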
Environment/Setup
No response