vincenzobaz / spark-scala3

Apache License 2.0

UDFs don't work with `Option` fields in case classes #36

Closed joan38 closed 1 year ago

joan38 commented 1 year ago

I get the following error when a UDF returns a case class that has `Option` fields:

scala.MatchError: ObjectType(class java.lang.String) (of class org.apache.spark.sql.types.ObjectType)

This test reproduces the issue:

  case class DataWithX2(name: Option[String], x: Int)

  test("Option in case class") {
    val input =
      Seq(DataWithPos("zero", 0, 0, 0), DataWithPos("something", 1, 2, 3))
    val df = input.toDF()
    df.createOrReplaceTempView("data")
    val dataxx = udf((name: String, x: Int) => DataWithX2(Option(name), 2 * x))
    val res = spark.sql("SELECT * from data")
      .withColumn("d", dataxx(col("name"), col("x")))
      .collect().toList
    println("res =====")
    println(res)
//    assert(res.size == res.distinct.size)
  }

joan38 commented 1 year ago

I can see a `// TODO: Nullable field` comment in EncoderDerivation.scala, which might be related.
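
For context, a `scala.MatchError` of this shape usually means a pattern match reached a value it has no case for. The snippet below is only a minimal sketch of that mechanism and is not the library's code: `DataType`, `StringType`, `IntegerType`, `ObjectType`, `sqlName` and `tryName` are hypothetical stand-ins defined here so the example runs without Spark.

```scala
// Hypothetical stand-ins for a data type hierarchy; these are NOT Spark's classes.
sealed trait DataType
case object StringType extends DataType
case object IntegerType extends DataType
final case class ObjectType(cls: Class[?]) extends DataType

// Deliberately non-exhaustive: there is no case for ObjectType.
// @unchecked silences the compiler warning so the gap only shows up at runtime.
def sqlName(dt: DataType): String = (dt: @unchecked) match {
  case StringType  => "string"
  case IntegerType => "int"
}

// Returns the error message instead of throwing, so the failure is easy to inspect.
def tryName(dt: DataType): String =
  try sqlName(dt)
  catch { case e: MatchError => s"MatchError: ${e.getMessage}" }
```

Calling `tryName(StringType)` returns `"string"`, while `tryName(ObjectType(classOf[String]))` returns a message that starts with `MatchError: ObjectType(class java.lang.String)`, the same shape as the error reported above. If the derivation code has a similar gap for the type produced by an `Option[String]` field, adding a case for it would be one possible direction for a fix, but that is an assumption and not something confirmed in this thread.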