Open birschick-bq opened 3 months ago
This seems to overlap partially with #1704 in that returning the Arrow types for each column would let us know definitively what to expect when fetching the data, including the type and nullability of the columns.
data_type/sql_data_type were taken from Flight SQL which in turn inherited from JDBC (@jduo correct me if I'm wrong): https://docs.oracle.com/en/java/javase/11/docs/api/java.sql/java/sql/DatabaseMetaData.html#getColumns(java.lang.String,java.lang.String,java.lang.String,java.lang.String)
DATA_TYPE int => SQL type from java.sql.Types
SQL_DATA_TYPE int => unused
The type codes are based on the JDBC Types constants and the ODBC SQL_* types (which are usually the same).
There are a few cases where they differ, such as JDBC having ARRAY and ODBC having interval types.
The xdbc_sql_data_type field differs in that it can store a database-specific type code to give a more specific type (for example a DECIMAL that is actually a currency vs. an arbitrary high precision number).
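To make the overlap concrete, here is a minimal sketch comparing a handful of JDBC and ODBC type codes. The two dictionaries are small illustrative subsets, not complete lists; the values are the ones published in java.sql.Types and the ODBC headers (with the SQL_ prefix dropped from the ODBC names), and compare_type_codes is a hypothetical helper, not part of ADBC.

```python
# Illustrative subsets only. JDBC values are from java.sql.Types;
# ODBC values are from the ODBC headers, with the SQL_ prefix dropped.
JDBC_TYPES = {"INTEGER": 4, "VARCHAR": 12, "DECIMAL": 3, "ARRAY": 2003}
ODBC_TYPES = {"INTEGER": 4, "VARCHAR": 12, "DECIMAL": 3, "INTERVAL_YEAR": 101}


def compare_type_codes(jdbc, odbc):
    """Split type names into shared-and-equal, JDBC-only, and ODBC-only."""
    shared = sorted(n for n in jdbc.keys() & odbc.keys() if jdbc[n] == odbc[n])
    jdbc_only = sorted(jdbc.keys() - odbc.keys())
    odbc_only = sorted(odbc.keys() - jdbc.keys())
    return shared, jdbc_only, odbc_only


shared, jdbc_only, odbc_only = compare_type_codes(JDBC_TYPES, ODBC_TYPES)
print(shared)     # ['DECIMAL', 'INTEGER', 'VARCHAR']
print(jdbc_only)  # ['ARRAY']
print(odbc_only)  # ['INTERVAL_YEAR']
```

A consumer that only understands the shared codes would need a fallback for the JDBC-only and ODBC-only entries, which is where a database-specific code in xdbc_sql_data_type could help.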
What feature or improvement would you like to see?
As we develop drivers for various data sources, we find that consumers of the driver not only need a reliable API, but also reliable metadata results to make consuming different drivers less data source specific.
While consumers can use GetTableSchema, it may not provide enough information about the data source's unique column properties. So consumers will use GetObjects to get more information about the native metadata of the data source. However, there is a large amount of flexibility afforded the values in the COLUMN_SCHEMA structure.
I'd like to propose a more restrictive or suggestive description of the field contents so that consuming this information can be more portable. I believe the "agnostic manner" intention is to use JDBC/ODBC values, if possible. Or, reading into this more, values that can be reliably understood by the consumer of the call. The other possibility is to add new fields to the structure which would follow more restrictive specifications.
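As a rough illustration of what a more restrictive description buys a consumer, the sketch below checks the two nullability fields against a fixed set of values. It assumes the encodings suggested in this issue (0/1/2 or null for xdbc_nullable; "NO"/"YES"/"" or null for xdbc_is_nullable), which are a proposal and not the current ADBC specification. The function name describe_nullability is hypothetical.

```python
from typing import Optional

# Encodings proposed in this issue; an assumption, not the current ADBC spec.
NULLABLE_CODES = {0: "not nullable", 1: "nullable", 2: "unknown"}
IS_NULLABLE_STRINGS = {"NO": "not nullable", "YES": "nullable", "": "unknown"}


def describe_nullability(
    xdbc_nullable: Optional[int], xdbc_is_nullable: Optional[str]
) -> str:
    """Interpret the two fields, rejecting values outside the proposed sets."""
    if xdbc_nullable is not None and xdbc_nullable not in NULLABLE_CODES:
        raise ValueError(f"unexpected xdbc_nullable: {xdbc_nullable!r}")
    if xdbc_is_nullable is not None and xdbc_is_nullable not in IS_NULLABLE_STRINGS:
        raise ValueError(f"unexpected xdbc_is_nullable: {xdbc_is_nullable!r}")

    # Both null: the data source does not report nullability at all.
    if xdbc_nullable is None and xdbc_is_nullable is None:
        return "unsupported by data source"

    from_code = None if xdbc_nullable is None else NULLABLE_CODES[xdbc_nullable]
    from_text = None if xdbc_is_nullable is None else IS_NULLABLE_STRINGS[xdbc_is_nullable]

    # If a driver fills in both fields, they should agree.
    if from_code is not None and from_text is not None and from_code != from_text:
        raise ValueError("xdbc_nullable and xdbc_is_nullable disagree")
    return from_code if from_code is not None else from_text


print(describe_nullability(1, "YES"))    # nullable
print(describe_nullability(None, None))  # unsupported by data source
```

With a fixed value set, a check like this can be written once and reused across drivers instead of being special-cased per data source.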
Examples:
- xdbc_type_name should contain string values taken from the JDBC JDBCType enumeration, ODBC identifiers, or both, drawn from an ADBC-defined list of acceptable values.
- xdbc_data_type and xdbc_sql_data_type are not clearly defined, nor is their difference (if any). It could be that xdbc_data_type is defined by the JDBC values and xdbc_sql_data_type by the ODBC values. Still, carrying around these legacy values is not ideal, and we should likely associate an ADBC-defined value with one or both of these two fields.
- xdbc_nullable (int16) should be explicitly defined as 0 (not nullable), 1 (nullable), 2 (unknown), or null (unsupported by data source).
- xdbc_is_nullable should be explicitly defined as "NO", "YES", "" (unknown), or null (unsupported by data source).
The result of this discussion should be