apache / doris-spark-connector

Spark Connector for Apache Doris
https://doris.apache.org/
Apache License 2.0

[Bug] Unable to correctly recognize time partition fields #204

Closed: tuoluzhe8521 closed this issue 1 month ago

tuoluzhe8521 commented 1 month ago

Version

doris-spark-connector: 1.3.0-1.3.2
doris: 2.0
hive: 3.1.3
hadoop: 3.3.4
spark: 3.3.1

What's Wrong?

spark-sql (default)> CREATE TEMPORARY VIEW dwd_test
                   > USING doris OPTIONS(
                   >   'table.identifier' = 'dw_dwd.dwd_test',
                   >   'fenodes' = 'xxx:8030',
                   >   'user' = 'xxxx',
                   >   'password' = 'xxx',
                   >   'sink.properties.format' = 'json'
                   > );
Response code
Time taken: 3.393 seconds
spark-sql (default)> select * from dwd_test where dt = '2024-01-02' limit 3;

14:07:18.625 [main] ERROR org.apache.doris.spark.sql.ScalaDorisRowRDD - Doris FE's response cannot map to schema. res: {"exception":"errCode = 2, detailMessage = Incorrect datetime value: CAST(2021 AS DATETIME) in expression: (CAST(dt AS DATETIME) = CAST(2021 AS DATETIME))","status":400}
org.apache.doris.shaded.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "exception" (class org.apache.doris.spark.rest.models.QueryPlan), not marked as ignorable (3 known properties: "partitions", "status", "opaqued_query_plan"])
 at [Source: (String)"{"exception":"errCode = 2, detailMessage = Incorrect datetime value: CAST(2021 AS DATETIME) in expression: (CAST(dt AS DATETIME) = CAST(2021 AS DATETIME))","status":400}"; line: 1, column: 15] (through reference chain: org.apache.doris.spark.rest.models.QueryPlan["exception"])
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:61) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.DeserializationContext.handleUnknownProperty(DeserializationContext.java:1127) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.std.StdDeserializer.handleUnknownProperty(StdDeserializer.java:2036) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperty(BeanDeserializerBase.java:1700) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownVanilla(BeanDeserializerBase.java:1678) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:320) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:177) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4674) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3629) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3597) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.spark.rest.RestService.getQueryPlan(RestService.java:284) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.spark.rest.RestService.findPartitions(RestService.java:261) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.spark.rdd.AbstractDorisRDD.dorisPartitions$lzycompute(AbstractDorisRDD.scala:58) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.spark.rdd.AbstractDorisRDD.dorisPartitions(AbstractDorisRDD.scala:57) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.spark.rdd.AbstractDorisRDD.getPartitions(AbstractDorisRDD.scala:35) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:292) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:288) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:292) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:288) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:292) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:288) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:292) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:288) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:476) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:459) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:48) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:451) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.HiveResult$.hiveResultString(HiveResult.scala:76) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.$anonfun$run$2(SparkSQLDriver.scala:69) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:109) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:69) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:384) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:504) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1$adapted(SparkSQLCLIDriver.scala:498) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at scala.collection.Iterator.foreach(Iterator.scala:943) ~[scala-library-2.12.15.jar:?]
    at scala.collection.Iterator.foreach$(Iterator.scala:943) ~[scala-library-2.12.15.jar:?]
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431) ~[scala-library-2.12.15.jar:?]
    at scala.collection.IterableLike.foreach(IterableLike.scala:74) ~[scala-library-2.12.15.jar:?]
    at scala.collection.IterableLike.foreach$(IterableLike.scala:73) ~[scala-library-2.12.15.jar:?]
    at scala.collection.AbstractIterable.foreach(Iterable.scala:56) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:498) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:286) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_212]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_212]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_212]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_212]
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) ~[spark-core_2.12-3.3.1.jar:3.3.1]

14:07:18.643 [main] ERROR org.apache.spark.sql.hive.thriftserver.SparkSQLDriver - Failed in [select * from dwd_cc_trade_pay_success_di where dt ='2024-01-02' limit 3]
org.apache.doris.spark.exception.DorisException: Doris FE's response cannot map to schema. res: {"exception":"errCode = 2, detailMessage = Incorrect datetime value: CAST(2021 AS DATETIME) in expression: (CAST(dt AS DATETIME) = CAST(2021 AS DATETIME))","status":400}
    at org.apache.doris.spark.rest.RestService.getQueryPlan(RestService.java:292) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.spark.rest.RestService.findPartitions(RestService.java:261) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.spark.rdd.AbstractDorisRDD.dorisPartitions$lzycompute(AbstractDorisRDD.scala:58) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.spark.rdd.AbstractDorisRDD.dorisPartitions(AbstractDorisRDD.scala:57) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.spark.rdd.AbstractDorisRDD.getPartitions(AbstractDorisRDD.scala:35) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:292) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:288) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:292) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:288) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:292) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:288) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:292) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:288) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:476) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:459) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:48) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:451) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.HiveResult$.hiveResultString(HiveResult.scala:76) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.$anonfun$run$2(SparkSQLDriver.scala:69) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:109) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) ~[spark-sql_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:69) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:384) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:504) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1$adapted(SparkSQLCLIDriver.scala:498) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at scala.collection.Iterator.foreach(Iterator.scala:943) ~[scala-library-2.12.15.jar:?]
    at scala.collection.Iterator.foreach$(Iterator.scala:943) ~[scala-library-2.12.15.jar:?]
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431) ~[scala-library-2.12.15.jar:?]
    at scala.collection.IterableLike.foreach(IterableLike.scala:74) ~[scala-library-2.12.15.jar:?]
    at scala.collection.IterableLike.foreach$(IterableLike.scala:73) ~[scala-library-2.12.15.jar:?]
    at scala.collection.AbstractIterable.foreach(Iterable.scala:56) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:498) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:286) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) ~[spark-hive-thriftserver_2.12-3.3.1.jar:3.3.1]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_212]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_212]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_212]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_212]
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) ~[spark-core_2.12-3.3.1.jar:3.3.1]
Caused by: org.apache.doris.shaded.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "exception" (class org.apache.doris.spark.rest.models.QueryPlan), not marked as ignorable (3 known properties: "partitions", "status", "opaqued_query_plan"])
 at [Source: (String)"{"exception":"errCode = 2, detailMessage = Incorrect datetime value: CAST(2021 AS DATETIME) in expression: (CAST(dt AS DATETIME) = CAST(2021 AS DATETIME))","status":400}"; line: 1, column: 15] (through reference chain: org.apache.doris.spark.rest.models.QueryPlan["exception"])
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:61) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.DeserializationContext.handleUnknownProperty(DeserializationContext.java:1127) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.std.StdDeserializer.handleUnknownProperty(StdDeserializer.java:2036) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperty(BeanDeserializerBase.java:1700) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownVanilla(BeanDeserializerBase.java:1678) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:320) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:177) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4674) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3629) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.shaded.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3597) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    at org.apache.doris.spark.rest.RestService.getQueryPlan(RestService.java:284) ~[spark-doris-connector-3.3_2.12-1.3.2.jar:1.4.0-SNAPSHOT]
    ... 55 more
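Reading the detailMessage closely suggests what may be going wrong (this is an inference, not something confirmed in the thread): the date literal appears to reach Doris unquoted, so the FE parses dt = 2024-01-02 as integer arithmetic, 2024 - 1 - 2 = 2021, which is exactly the CAST(2021 AS DATETIME) in the error. The Scala sketch below is purely illustrative (compileFilterValue is a hypothetical helper, not the connector's code) and shows how quoting date-like values keeps them literal:

```scala
// Illustration only: a hypothetical sketch, not the connector's actual code.
object DatePushdownSketch {
  // Unquoted, "2024-01-02" is integer arithmetic to a SQL parser:
  val misparsed: Int = 2024 - 1 - 2 // == 2021, matching CAST(2021 AS DATETIME)

  // Hypothetical value compiler: quoting date-like values keeps them literals.
  def compileFilterValue(value: Any): String = value match {
    case d: java.sql.Date      => s"'$d'" // dt = '2024-01-02'
    case t: java.sql.Timestamp => s"'$t'"
    case s: String             => s"'$s'"
    case other                 => other.toString
  }

  def main(args: Array[String]): Unit = {
    val dt = java.sql.Date.valueOf("2024-01-02")
    println(s"unquoted pushdown: dt = $dt")                       // FE sees the number 2021
    println(s"quoted pushdown:   dt = ${compileFilterValue(dt)}") // dt = '2024-01-02'
    println(s"2024 - 1 - 2 = $misparsed")
  }
}
```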

What You Expected?

Queries against Doris should return correct results when the filter uses a date-format value and the filter field is a partition field.

How to Reproduce?

No response

Anything Else?

Querying Doris with a date-format filter works correctly when the filter field is not a partition field.

tuoluzhe8521 commented 1 month ago

I can query data from Doris correctly like this: select * from dwd_test where date_format(dt, 'yyyyMMdd') = '20240102' limit 3; but written this way the query may not hit partition pruning and will scan the entire table.
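If the immediate goal is to keep partition pruning while avoiding the broken pushdown, one possible workaround (assuming the connector's documented doris.filter.query read option behaves as documented, i.e. the predicate string is handed to Doris verbatim; untested here) is to move the quoted predicate into the view options instead of the Spark WHERE clause:

```scala
import org.apache.spark.sql.SparkSession

// Workaround sketch, assuming the connector's documented `doris.filter.query`
// option: the predicate is passed through to Doris as-is, so the FE can still
// prune partitions on dt without Spark compiling the date literal itself.
object FilterQueryWorkaround {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("doris-filter-query").getOrCreate()

    spark.sql("""
      CREATE TEMPORARY VIEW dwd_test_filtered
      USING doris OPTIONS(
        'table.identifier' = 'dw_dwd.dwd_test',
        'fenodes'  = 'xxx:8030',
        'user'     = 'xxxx',
        'password' = 'xxx',
        'doris.filter.query' = "dt = '2024-01-02'"
      )
    """)

    spark.sql("SELECT * FROM dwd_test_filtered LIMIT 3").show()
  }
}
```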

gnehil commented 1 month ago

Please post the create table statement, I'll try to reproduce it.

tuoluzhe8521 commented 1 month ago

Please post the create table statement, I'll try to reproduce it.

CREATE TABLE dwd_test (
  dt date NULL COMMENT 'processing date',
  id varchar(64) NULL COMMENT 'recharge order ID',
  pay_success_time datetime NULL COMMENT 'order payment success time',
  user_id varchar(64) NULL COMMENT 'user ID',
  amount int(11) NULL COMMENT 'amount',
  app_id varchar(255) NULL COMMENT 'appId',
  body varchar(255) NULL COMMENT 'body',
  channel varchar(64) NULL COMMENT 'payment channel',
  currency varchar(32) NULL COMMENT 'currency',
  description varchar(255) NULL COMMENT 'recharge description',
  extra varchar(512) NULL,
  metadata varchar(255) NULL,
  subject varchar(255) NULL,
  ip varchar(64) NULL COMMENT 'user IP',
  order_no varchar(64) NULL COMMENT 'order ID',
  pay_dts int(11) NULL COMMENT 'recharge order creation time',
  pay_id varchar(128) NULL COMMENT 'payment correlation id',
  is_deleted int(11) NULL COMMENT 'deleted flag',
  is_test int(11) NULL COMMENT 'test flag'
) ENGINE=OLAP
UNIQUE KEY(dt, id)
COMMENT 'recharge success fact table (transaction domain)'
PARTITION BY RANGE(dt) (
  PARTITION p202312 VALUES [('0000-01-01'), ('2024-01-01')),
  PARTITION p202401 VALUES [('2024-01-01'), ('2024-02-01')),
  PARTITION p202402 VALUES [('2024-02-01'), ('2024-03-01')),
  PARTITION p202403 VALUES [('2024-03-01'), ('2024-04-01')),
  PARTITION p202404 VALUES [('2024-04-01'), ('2024-05-01')),
  PARTITION p202405 VALUES [('2024-05-01'), ('2024-06-01')),
  PARTITION p202406 VALUES [('2024-06-01'), ('2024-07-01')),
  PARTITION p202407 VALUES [('2024-07-01'), ('2024-08-01'))
)
DISTRIBUTED BY HASH(id) BUCKETS 3
PROPERTIES (
  "replication_allocation" = "tag.location.default: 3",
  "is_being_synced" = "false",
  "dynamic_partition.enable" = "true",
  "dynamic_partition.time_unit" = "MONTH",
  "dynamic_partition.time_zone" = "Asia/Shanghai",
  "dynamic_partition.start" = "-2147483648",
  "dynamic_partition.end" = "1",
  "dynamic_partition.prefix" = "p",
  "dynamic_partition.replication_allocation" = "tag.location.default: 3",
  "dynamic_partition.buckets" = "3",
  "dynamic_partition.create_history_partition" = "false",
  "dynamic_partition.history_partition_num" = "-1",
  "dynamic_partition.hot_partition_num" = "0",
  "dynamic_partition.reserved_history_periods" = "NULL",
  "dynamic_partition.storage_policy" = "",
  "dynamic_partition.storage_medium" = "HDD",
  "dynamic_partition.start_day_of_month" = "1",
  "storage_format" = "V2",
  "compression" = "ZSTD",
  "enable_unique_key_merge_on_write" = "true",
  "light_schema_change" = "true",
  "disable_auto_compaction" = "false",
  "enable_single_replica_compaction" = "false"
);
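For anyone trying to reproduce this outside the spark-sql shell, a minimal DataFrame read against this table (the fenodes, user, and password values below are the same placeholders used in the report) should exercise the same pushdown path:

```scala
import org.apache.spark.sql.SparkSession

// Minimal reproduction sketch using the connector's DataFrame read path;
// connection values are placeholders from the report, not real endpoints.
object ReproducePartitionFilter {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("doris-partition-filter-repro").getOrCreate()

    val df = spark.read
      .format("doris")
      .option("doris.table.identifier", "dw_dwd.dwd_test")
      .option("doris.fenodes", "xxx:8030")
      .option("user", "xxxx")
      .option("password", "xxx")
      .load()

    // Filtering on the date-typed partition column dt is what produced
    // "CAST(2021 AS DATETIME)" in the FE error above.
    df.filter("dt = '2024-01-02'").limit(3).show()
  }
}
```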

tuoluzhe8521 commented 1 month ago

Please post the create table statement, I'll try to reproduce it.

Can you help me solve this problem? Thank you.

gnehil commented 1 month ago

Please post the create table statement, I'll try to reproduce it.

Can you help me solve this problem? Thank you.

I haven't been able to reproduce it, but you can search for the keyword "receive SQL statement" in fe.log to see the exact query that FE received.

tuoluzhe8521 commented 1 month ago

This PR solves it: https://github.com/apache/doris-spark-connector/pull/209/files
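Independent of that fix on the query side, the stack trace above also shows a secondary problem: the FE's error JSON carries an "exception" field that the QueryPlan model rejects, so the real FE message gets buried under a Jackson UnrecognizedPropertyException. The sketch below is illustrative only (QueryPlanSketch is a stand-in, not the connector's org.apache.doris.spark.rest.models.QueryPlan) and shows how tolerating unknown fields would let the FE error surface cleanly:

```scala
import com.fasterxml.jackson.annotation.JsonIgnoreProperties
import com.fasterxml.jackson.databind.ObjectMapper
import scala.beans.BeanProperty

// Illustrative stand-in for the connector's QueryPlan model: ignoring unknown
// fields such as "exception" avoids the secondary UnrecognizedPropertyException
// and lets the caller report the FE's actual error message.
@JsonIgnoreProperties(ignoreUnknown = true)
class QueryPlanSketch {
  @BeanProperty var status: Int = 0
  @BeanProperty var opaqued_query_plan: String = _
  @BeanProperty var partitions: java.util.Map[String, AnyRef] = _
}

object TolerantQueryPlanParse {
  def main(args: Array[String]): Unit = {
    val res = """{"exception":"errCode = 2, detailMessage = Incorrect datetime value ...","status":400}"""
    val plan = new ObjectMapper().readValue(res, classOf[QueryPlanSketch])
    println(plan.getStatus) // 400 -> caller can raise a readable DorisException instead
  }
}
```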