Closed alex-krash closed 5 years ago
Support for the SQLTransformer class was introduced very recently, so some bumps in the road are to be expected.
This issue is specifically about "validating" data types in arithmetic expressions. It's probably safe to comment out/remove this validation logic altogether. However, the right thing to do would be to perform the "resolution" of the parsed Apache Spark SQL statement.
I couldn't find the right Apache Spark API/entry point for that. Basically, I expect there to be some org.apache.spark.sql.catalyst.plans.logical.LogicalPlan#resolveAll(StructType) method that visits the AST and adds data type information to individual AST nodes based on the data schema.
@alex-krash Do you know how to turn a "raw" LogicalPlan object into a "resolved" LogicalPlan object?
> Do you know how to turn a "raw" LogicalPlan object into a "resolved" LogicalPlan object?

@vruusmann, I am trying to figure out how resolving can be implemented, but no luck yet :( I think the same error will occur when getting dataType from any of the Unresolved* instances.
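To illustrate the point about Unresolved* nodes, here is a minimal sketch that parses a statement without analyzing it and then tries to read dataType from each top-level expression. The object name, the helper name `parse`, and the table name `t` are hypothetical and used only for illustration; `CatalystSqlParser` is an internal Catalyst class, so its behaviour may differ between Spark versions.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import scala.util.Try

// Hypothetical sketch, not part of Spark: shows what a "raw" (parsed only) plan looks like.
object UnresolvedPlanSketch {

  // Parse the statement only; no catalog lookup and no schema are involved at this stage.
  def parse(statement: String): LogicalPlan =
    CatalystSqlParser.parsePlan(statement)

  def main(args: Array[String]): Unit = {
    val parsed = parse("SELECT (x1 + x2) AS total FROM t")

    // A parsed-only plan still contains unresolved attributes and relations.
    println(s"resolved = ${parsed.resolved}")

    // Wrap the dataType access in Try, because unresolved expressions throw here.
    parsed.expressions.foreach { expr =>
      println(s"$expr -> ${Try(expr.dataType)}")
    }
  }
}
```

The sketch only demonstrates the failure mode; it does not attempt to resolve anything.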
I don't have an elegant solution for now. It looks like the statement is already parsed and analyzed within SQLTransformer itself:
```scala
@Since("1.6.0")
override def transformSchema(schema: StructType): StructType = {
  val spark = SparkSession.builder().getOrCreate()
  val dummyRDD = spark.sparkContext.parallelize(Seq(Row.empty))
  val dummyDF = spark.createDataFrame(dummyRDD, schema)
  val tableName = Identifiable.randomUID(uid)
  val realStatement = $(statement).replace(tableIdentifier, tableName)
  dummyDF.createOrReplaceTempView(tableName)
  val outputSchema = spark.sql(realStatement).schema
  // spark.sql(realStatement).queryExecution.analyzed -- here is an analyzed plan
  spark.catalog.dropTempView(tableName)
  outputSchema
}
```
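The same trick could be reused outside SQLTransformer to obtain a resolved plan for a given schema. The following is a rough sketch, not a definitive implementation: the object `ResolvePlanSketch`, the helper `analyze`, and the column names are hypothetical. It assumes the statement uses the `__THIS__` placeholder, as SQLTransformer statements do, and it relies on `queryExecution`, which Spark marks as a developer/unstable API.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

// Hypothetical helper, not part of Spark: resolves a statement against a schema.
object ResolvePlanSketch {

  def analyze(spark: SparkSession, statement: String, schema: StructType): LogicalPlan = {
    // Register an empty DataFrame with the given schema under a throwaway view name.
    val viewName = "resolve_sketch_" + java.util.UUID.randomUUID().toString.replace("-", "")
    val emptyDF = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)
    emptyDF.createOrReplaceTempView(viewName)
    try {
      // Let Spark's own analyzer resolve attributes and data types.
      spark.sql(statement.replace("__THIS__", viewName)).queryExecution.analyzed
    } finally {
      spark.catalog.dropTempView(viewName)
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("resolve-sketch").getOrCreate()
    val schema = StructType(Seq(StructField("x1", DoubleType), StructField("x2", DoubleType)))

    val plan = analyze(spark, "SELECT *, (x1 + x2) AS total FROM __THIS__", schema)

    // Each output attribute of the analyzed plan carries a concrete data type.
    plan.output.foreach(attr => println(s"${attr.name}: ${attr.dataType}"))
    // Expressions of the top-level node can be inspected the same way.
    plan.expressions.foreach(expr => println(s"$expr : ${expr.dataType}"))

    spark.stop()
  }
}
```

This still goes through a temporary view and a SparkSession, so it is a workaround rather than the standalone "resolve against a StructType" entry point asked about above.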
Hello! I ran into a bug when using SQLTransformer (I am trying to implement engineered features with it):
An error:
Spark 2.3.0 (I also tried 2.3.2, with the same result). The full example is here: https://gist.github.com/alex-krash/46ec1947e9ea4f0d7ce9acb63c512e09

Is there a workaround for this?