Closed: mullerhai closed this issue 12 months ago
Unfortunately, Spark does not provide the `nullable` field to `JdbcDialect`: https://github.com/housepower/ClickHouse-Native-JDBC/blob/0d5ee97e2dc1ead0d86f23928f71ef43c4834fc3/clickhouse-integration/clickhouse-integration-spark/src/main/scala/org/apache/spark/sql/jdbc/ClickHouseDialect.scala#L100-L104
What do you think about overriding these with `Types.NULL`?
```scala
override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
  case StringType => Some(JdbcType("String", Types.VARCHAR))
  // ClickHouse doesn't have the concept of encodings. Strings can contain an arbitrary set of bytes,
  // which are stored and output as-is.
  // See detail at https://clickhouse.tech/docs/en/sql-reference/data-types/string/
  case BinaryType => Some(JdbcType("String", Types.NULL))
  case BooleanType => Some(JdbcType("UInt8", Types.NULL))
  case ByteType => Some(JdbcType("Int8", Types.NULL))
  case ShortType => Some(JdbcType("Int16", Types.NULL))
  case IntegerType => Some(JdbcType("Int32", Types.NULL))
  case LongType => Some(JdbcType("Int64", Types.NULL))
  case FloatType => Some(JdbcType("Float32", Types.NULL))
  case DoubleType => Some(JdbcType("Float64", Types.NULL))
  case t: DecimalType => Some(JdbcType(s"Decimal(${t.precision},${t.scale})", Types.NULL))
  case DateType => Some(JdbcType("Date", Types.NULL))
  case TimestampType => Some(JdbcType("DateTime", Types.NULL))
  case _ => None
}
```
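For reference on what `Types.NULL` actually is (assuming the proposal targets `java.sql.Types`): it is a plain `int` constant, the generic SQL NULL type code, not a function. So passing it as `JdbcType`'s second argument only replaces the concrete JDBC type code; it does not make the ClickHouse column type (the first argument) nullable. A minimal check:

```scala
// java.sql.Types members are plain int constants; JdbcType's second
// argument just carries one such code. Types.NULL is the generic SQL
// NULL code, so it cannot "wrap" another code like Types.INTEGER.
import java.sql.Types

object TypesNullCheck {
  val nullCode: Int    = Types.NULL    // generic SQL NULL type code
  val integerCode: Int = Types.INTEGER // the code IntegerType currently maps to

  def main(args: Array[String]): Unit =
    println(s"NULL=$nullCode INTEGER=$integerCode")
}
```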
> What do you think about overriding these with `Types.NULL`?

The point here is that we can't get the nullability of the Catalyst type `dt`, so for `IntegerType` we don't know how to map it to a ClickHouse type: `Int32` or `Nullable(Int32)`?
Currently, we map `IntegerType` to `Int32`, because `Nullable` is not suitable for sorting key columns. And, as you said, we can't get the expected result for those nullable columns.
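To illustrate the distinction, here is a pure-Scala sketch (a hypothetical helper, not part of the connector) of what a nullability-aware mapping could look like if the dialect API ever passed a `nullable` flag alongside the Catalyst type:

```scala
// Hypothetical sketch only: Spark's JdbcDialect does not pass nullability
// today, so this helper models the mapping on plain strings instead of
// Spark's DataType tree.
object NullableAwareMapping {
  // Base mapping from Spark SQL type names to ClickHouse types,
  // mirroring the dialect's current getJDBCType cases.
  private val base: Map[String, String] = Map(
    "StringType"    -> "String",
    "BooleanType"   -> "UInt8",
    "ByteType"      -> "Int8",
    "ShortType"     -> "Int16",
    "IntegerType"   -> "Int32",
    "LongType"      -> "Int64",
    "FloatType"     -> "Float32",
    "DoubleType"    -> "Float64",
    "DateType"      -> "Date",
    "TimestampType" -> "DateTime"
  )

  // Wrap in Nullable(...) only when the column is actually nullable,
  // so sorting key columns can stay plain (e.g. Int32).
  def clickhouseType(sparkType: String, nullable: Boolean): Option[String] =
    base.get(sparkType).map(t => if (nullable) s"Nullable($t)" else t)
}
```

With such a flag, `IntegerType` would map to `Int32` for a non-nullable column and `Nullable(Int32)` for a nullable one.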
Due to the limitation described above, we don't recommend depending heavily on this auto-create-table feature. A workaround is to let it auto-create the table, get the DDL via `SHOW CREATE TABLE xxx`, then tune the DDL manually.
Considering the limitations of the JDBC DataSource API, we are planning to build a native ClickHouse connector based on the DataSourceV2 API. It's a long-term solution and we don't have an ETA for this feature.
Do we support array fields when creating a ClickHouse table from Spark?
But I get the error `Exception in thread "main" java.lang.IllegalArgumentException: Can't get JDBC type for array`
Not yet
`Exception in thread "main" java.lang.IllegalArgumentException: Can't get JDBC type for array`

If I want to get a JDBC type for arrays, what should I do? Thank you~
Please provide a specific case.
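For background on what array support would involve: ClickHouse spells array types as `Array(T)`, nesting for multi-dimensional arrays, so a dialect handling Spark's `ArrayType` would need to build the type name recursively from the element type. A toy, self-contained sketch (hypothetical, not connector code):

```scala
// Toy model of recursive array-type mapping. The sealed trait stands in
// for Spark's DataType tree, which a real dialect would pattern-match on.
object ArrayTypeSketch {
  sealed trait T
  case object Int32T  extends T
  case object StringT extends T
  final case class ArrayT(element: T) extends T

  // Build the ClickHouse type name, nesting Array(...) for nested arrays.
  def clickhouseName(t: T): String = t match {
    case Int32T       => "Int32"
    case StringT      => "String"
    case ArrayT(elem) => s"Array(${clickhouseName(elem)})"
  }
}
```

For example, an array-of-int column would come out as `Array(Int32)`, and an array of string arrays as `Array(Array(String))`.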
Hi, I want to insert data into ClickHouse using Spark, and I've been using `libraryDependencies += "com.github.housepower" %% "clickhouse-integration-spark" % "2.5.2"`; it is a great tool. But I hit an error when my data has a null column. I think it might work if, when inserting, I could set the column struct type with a schema that carries the columns' data types. Could we have an option to infer nullable types when auto-creating the ClickHouse table and inserting into ClickHouse with this tool?