Open lyy-pineapple opened 1 month ago
I tried a similar SQL query with Spark 3.5.1 and did not hit the problem. Could you please provide detailed build info like the following?
Backend: Velox
Backend Branch: HEAD
Backend Revision: fb3eda9f85de36de59f842f65270bc6749e9bc51
Backend Revision Time: 2024-07-19 00:14:09 +0000
GCC Version: Apple clang version 15.0.0 (clang-1500.3.9.4)
Gluten Branch: master
Gluten Build Time: 2024-07-24T06:57:12Z
Gluten Repo URL: https://github.com/jackylee-ch/gluten.git
Gluten Revision: 2e47b580a8eff8615b152d16285448eceec71094
Gluten Revision Time: 2024-07-23 13:15:17 +0800
Gluten Version: 1.2.0-SNAPSHOT
Hadoop Version: 3.3.4
Java Version: 17
Scala Version: 2.12.15
Spark Version: 3.5.1
The main branch of Velox supports the slice function with bigint parameters. I'm not sure whether other array functions have the same issue. It can be reproduced on branch 1.1 with this SQL:
with t as (
  select /*+ repartition(2) */ a
  FROM values (array('a', 'b', 'c', 'd')), (array('a', 'b', 'c', 'd')), array('a', 'b', 'c', 'd') test(a)
)
SELECT explode(slice(a, 2, 2)) from t
Could you try to reproduce it on the main branch or branch 1.2? Branch 1.1 was cut a long time ago, so a problem seen on branch 1.1 may not occur on the main branch or branch 1.2.
I have tried the same query on the main branch. It works fine, and we get the following fallback message:
4/07/25 11:02:04 INFO GlutenFallbackReporter: Validation failed for plan: Project[QueryId=0], due to: Native validation failed: Optional[Validation failed due to exception caught at file:SubstraitToVeloxPlanValidator.cc line:1306 function:validate, thrown from file:ExprCompiler.cpp line:465 function:compileRewrittenExpression, reason:Scalar function name not registered: slice, called with arguments: (ARRAY<VARCHAR>, INTEGER, INTEGER).]; Native validation failed: Optional[Validation failed due to exception caught at file:SubstraitToVeloxPlanValidator.cc line:1306 function:validate, thrown from file:ExprCompiler.cpp line:465 function:compileRewrittenExpression, reason:Scalar function name not registered: slice, called with arguments: (ARRAY<VARCHAR>, INTEGER, INTEGER).].
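When scanning driver logs for this kind of fallback, it can help to pull out the function name and argument types from the "not registered" reason. The following is only a rough sketch: the regular expression is written against the message quoted in this thread, the helper names are made up for illustration, and it assumes the argument list contains no parentheses (so types such as DECIMAL(10, 2) are not handled).

```python
import re

# Matches the "function not registered" reason quoted in this thread.
# Assumption: the argument list itself contains no parentheses.
NOT_REGISTERED = re.compile(
    r"Scalar function name not registered: (?P<name>\w+), "
    r"called with arguments: \((?P<args>[^)]*)\)"
)


def split_top_level(args):
    """Split a type list on commas that are not nested inside <...>."""
    parts, depth, current = [], 0, []
    for ch in args:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current).strip())
    return parts


def unregistered_functions(log_line):
    """Return the distinct (name, arg_types) pairs reported in a log line."""
    return {
        (m.group("name"), tuple(split_top_level(m.group("args"))))
        for m in NOT_REGISTERED.finditer(log_line)
    }
```

Because the result is a set, a reason that is repeated within one log line (as in the message in this thread) is reported once.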
Backend
VL (Velox)
Bug description
Validation of GenerateRel does not attempt to compile the generator. If the generator is not supported, Spark SQL will not fall back for the GenerateRel, which results in job failure. For example, on v1.1.1, sql:
The Spark job will fail.
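The behaviour described here can be illustrated with a toy model. This is not Gluten's or Velox's code: the registry contents, the function names, and the "native"/"fallback" labels are all hypothetical and only show why checking the generator's input expression during validation turns a runtime failure into a fallback.

```python
# Hypothetical registry of native function signatures. The single entry is
# illustrative only and does not reflect what any Velox version registers.
REGISTERED = {
    ("slice", ("ARRAY<VARCHAR>", "BIGINT", "BIGINT")),
}


def validate_generate(generator_input, registered=REGISTERED):
    """Check the generator's input expression against the registry.

    generator_input is a (function_name, arg_types) pair, e.g. the
    slice(...) call inside explode(slice(a, 2, 2)).
    Returns (ok, reason); ok=False means the plan should fall back.
    """
    name, arg_types = generator_input
    if (name, tuple(arg_types)) not in registered:
        reason = (
            f"Scalar function name not registered: {name}, "
            f"called with arguments: ({', '.join(arg_types)})"
        )
        return False, reason
    return True, ""


def plan_generate(generator_input, registered=REGISTERED):
    """Pick an execution path at planning time instead of failing at runtime."""
    ok, _reason = validate_generate(generator_input, registered)
    return "native" if ok else "fallback"
```

With this check in place, `plan_generate(("slice", ["ARRAY<VARCHAR>", "INTEGER", "INTEGER"]))` selects the fallback path in the toy model, whereas skipping the check would hand the unsupported call to the native side.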
Spark version
None
Spark configurations
No response
System information
No response
Relevant logs
No response