apache / doris-spark-connector

Spark Connector for Apache Doris
https://doris.apache.org/
Apache License 2.0

[fix](compatible) Fix cast error when selecting data from Doris 2.0 #209

Closed liutang123 closed 3 months ago

liutang123 commented 3 months ago

Proposed changes

Environment: doris-spark-connector 1.3.2, Doris 2.0, Hive 3.1.3, Hadoop 3.3.4, Spark 3.3.1

Doris table:

CREATE TABLE `test_t` (
  `dt` date NULL COMMENT 'processing date',
  `id` varchar(64) NULL COMMENT 'order ID',
  `user_id` varchar(64) NULL COMMENT 'user ID',
  `amount` int(11) NULL COMMENT 'original order amount',
  `app_id` varchar(255) NULL COMMENT 'appId',
  `body` varchar(255) NULL COMMENT 'body',
  `channel` varchar(64) NULL COMMENT 'payment channel',
  `currency` varchar(32) NULL COMMENT 'currency',
  `description` varchar(255) NULL COMMENT 'recharge description',
  `extra` varchar(512) NULL,
  `metadata` varchar(255) NULL,
  `subject` varchar(255) NULL,
  `ip` varchar(64) NULL COMMENT 'user IP',
  `order_no` varchar(64) NULL COMMENT 'order ID',
  `pay_dts` int(11) NULL COMMENT 'recharge time',
  `pay_id` varchar(128) NULL COMMENT 'associated payment ID',
  `is_deleted` int(11) NULL COMMENT 'whether deleted',
  `is_test` int(11) NULL COMMENT 'whether test',
  `refund_amount` int(11) NULL COMMENT 'refund amount',
  `recharge_id` varchar(64) NULL COMMENT 'associated recharge order ID',
  `refund_id` text NULL COMMENT 'associated refund order ID',
  `refund_dts` int(11) NULL COMMENT 'last refund time (binlog)'
) ENGINE=OLAP
UNIQUE KEY(`dt`, `id`)
COMMENT 'transaction-domain refund fact table'
PARTITION BY RANGE(`dt`)
(PARTITION p202312 VALUES [('0000-01-01'), ('2024-01-01')),
PARTITION p202401 VALUES [('2024-01-01'), ('2024-02-01')),
PARTITION p202402 VALUES [('2024-02-01'), ('2024-03-01')),
PARTITION p202403 VALUES [('2024-03-01'), ('2024-04-01')),
PARTITION p202404 VALUES [('2024-04-01'), ('2024-05-01')),
PARTITION p202405 VALUES [('2024-05-01'), ('2024-06-01')),
PARTITION p202406 VALUES [('2024-06-01'), ('2024-07-01')),
PARTITION p202407 VALUES [('2024-07-01'), ('2024-08-01')))
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"is_being_synced" = "false",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "MONTH",
"dynamic_partition.time_zone" = "Asia/Shanghai",
"dynamic_partition.start" = "-2147483648",
"dynamic_partition.end" = "1",
"dynamic_partition.prefix" = "p",
"dynamic_partition.replication_allocation" = "tag.location.default: 1",
"dynamic_partition.buckets" = "1",
"dynamic_partition.create_history_partition" = "false",
"dynamic_partition.history_partition_num" = "-1",
"dynamic_partition.hot_partition_num" = "0",
"dynamic_partition.reserved_history_periods" = "NULL",
"dynamic_partition.storage_policy" = "",
"dynamic_partition.storage_medium" = "HDD",
"dynamic_partition.start_day_of_month" = "1",
"storage_format" = "V2",
"compression" = "ZSTD",
"enable_unique_key_merge_on_write" = "true",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
);

Spark SQL:

CREATE TEMPORARY VIEW spark_doris
USING doris
OPTIONS(
  "table.identifier"="test.test_t",
  "fenodes"="127.0.0.1:8030",
  "user"="admin",
  "password"=""
);
SELECT * FROM spark_doris WHERE dt = '2024-06-02' LIMIT 10;
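
The predicate on `dt` is what has to be pushed down to Doris. One plausible reading of the failure reported in this PR is that the date value reached Doris without its quotes: an unquoted 2024-06-02 is integer subtraction, 2024 - 6 - 2 = 2016, which matches the `CAST(2016 AS DATETIME)` in the error. The sketch below illustrates the idea in Python; it is not the connector's code. `compile_value` and `equals_predicate` are hypothetical names, and the MySQL-style escaping is an assumption.

```python
from datetime import date, datetime


def compile_value(value):
    """Render a Python value as a SQL literal (hypothetical helper)."""
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime):
        return "'" + value.strftime("%Y-%m-%d %H:%M:%S") + "'"
    if isinstance(value, date):
        # Quoting is the important part: without it the literal
        # is parsed as an arithmetic expression.
        return "'" + value.isoformat() + "'"
    if isinstance(value, str):
        # Assumed MySQL-style escaping: backslashes first, then quotes.
        escaped = value.replace("\\", "\\\\").replace("'", "''")
        return "'" + escaped + "'"
    return str(value)


def equals_predicate(column, value):
    """Build a `column` = literal predicate for pushdown (hypothetical helper)."""
    return "`{}` = {}".format(column, compile_value(value))


if __name__ == "__main__":
    print(equals_predicate("dt", date(2024, 6, 2)))
    # What an unquoted date literal evaluates to as arithmetic.
    print(2024 - 6 - 2)
```

Handling each temporal type explicitly, rather than falling through to `str(value)`, is the design choice that keeps a date from being emitted as a bare expression.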

Error message:

Caused by: org.apache.doris.shaded.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "exception" (class org.apache.doris.spark.rest.models.QueryPlan), not marked as ignorable (3 known properties: "partitions", "status", "opaqued_query_plan"])
 at [Source: (String)"{"exception":"errCode = 2, detailMessage = Incorrect datetime value: CAST(2016 AS DATETIME) in expression: (CAST(`dt` AS DATETIME) = CAST(2016 AS DATETIME))","status":400}"; line: 1, column: 15] (through reference chain: org.apache.doris.spark.rest.models.QueryPlan["exception"])
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:61)
...
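
The trace shows two problems stacked together: Doris rejected the query with status 400, and the client then failed while mapping that error body onto `QueryPlan`, which only knows `partitions`, `status`, and `opaqued_query_plan`. A minimal sketch of checking for an error payload before mapping it is shown below, in Python for illustration. `parse_query_plan` and `DorisQueryPlanError` are hypothetical names, and treating status 200 as the success value is an assumption.

```python
import json

# Field names taken from the "3 known properties" listed in the error message.
KNOWN_FIELDS = ("partitions", "status", "opaqued_query_plan")


class DorisQueryPlanError(Exception):
    """Raised when the query plan response carries an error (hypothetical)."""


def parse_query_plan(body):
    """Parse a query plan response, surfacing server-side errors first."""
    payload = json.loads(body)
    # Assumption: a successful response has status 200 and no "exception" key.
    if "exception" in payload or payload.get("status") != 200:
        raise DorisQueryPlanError(
            "status={}: {}".format(
                payload.get("status"),
                payload.get("exception", "unknown error"),
            )
        )
    # Keep only the known fields and ignore anything unexpected.
    return {name: payload.get(name) for name in KNOWN_FIELDS}


if __name__ == "__main__":
    error_body = (
        '{"exception":"errCode = 2, detailMessage = Incorrect datetime value: '
        'CAST(2016 AS DATETIME)","status":400}'
    )
    try:
        parse_query_plan(error_body)
    except DorisQueryPlanError as err:
        print(err)
```

With a check like this, the user would see the Doris message about the incorrect datetime value directly instead of a deserialization error about an unrecognized field.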
