AbsaOSS / hyperdrive

Extensible streaming ingestion pipeline on top of Apache Spark
Apache License 2.0
44 stars 13 forks source link

Fix compiler warning: "Unchecked type argument" #289

Closed kevinwallimann closed 1 year ago

kevinwallimann commented 1 year ago

Description

During compilation, the compiler warns that a type argument is unchecked because it is eliminated by type erasure. This is because in AbrisTestUtil, pattern matching is used with a generic classes, i.e. Option[T] (where T is the type parameter, Map[String, String] in this example) or Map[K, V], where K and V are the type parameters, namely K is String and V is Any. Because of type erasure, the type parameters won't be available at runtime and therefore the pattern will match not just Option[Map[String, String]], but any Option[T], e.g. Option[String], Option[Integer], etc., hence the warning that the type arguments are unchecked.

Stacktrace

[WARNING] hyperdrive/ingestor-default/src/test/scala/za/co/absa/hyperdrive/ingestor/implementation/testutils/abris/AbrisTestUtil.scala:25: warning: non-variable type argument Map[String,String] in type pattern Option[Map[String,String]] is unchecked since it is eliminated by erasure
[WARNING]       case value: Option[Map[String, String]] => value
[WARNING]                   ^
[WARNING] hyperdrive/ingestor-default/src/test/scala/za/co/absa/hyperdrive/ingestor/implementation/testutils/abris/AbrisTestUtil.scala:34: warning: non-variable type argument String in type pattern scala.collection.immutable.Map[String,Any] (the underlying of Map[String,Any]) is unchecked since it is eliminated by erasure
[WARNING]       case abrisConfig: Map[String, Any] => abrisConfig
[WARNING]                         ^
[WARNING] hyperdrive/ingestor-default/src/test/scala/za/co/absa/hyperdrive/ingestor/implementation/testutils/abris/AbrisTestUtil.scala:46: warning: non-variable type argument String in type pattern scala.collection.immutable.Map[String,Any] (the underlying of Map[String,Any]) is unchecked since it is eliminated by erasure
[WARNING]       case abrisConfig: Map[String, Any] => abrisConfig
[WARNING]                         ^

Details

In AbrisTestUtil.scala, there are three match case statements which cause the above warnings.

  def getSchemaRegistryConf(fromAvroConfig: FromAvroConfig): Option[Map[String, String]] = {
    from_avro(lit(1), fromAvroConfig).expr.productElement(2) match {
      case value: Option[Map[String, String]] => value
      case _ => throw new IllegalArgumentException("Test internal exception. Either the product index or the type of FromAvroConfig.schemaRegistryConf is wrong")
    }
  }

The problem is in this line: case value: Option[Map[String, String]] => value During compilation, the type is erased, hence during runtime the JVM only "sees" this: case value: Option => value. It is the intention of the code to throw an exception, if e.g. an Option[Integer] is passed in, but due to type erasure, the exception will not be thrown.

Solution proposal

A possible solution, as outlined in https://gist.github.com/jkpl/5279ee05cca8cc1ec452fc26ace5b68b#avoid-matching-on-generic-type-parameters, is to use a boxed type, e.g. a case class around the generic type. For example, one could create a case class like this

case class SchemaRegistryConf (value: Option[Map[String, String]])

and then rewrite the match case like this

      case conf: SchemaRegistryConf => conf.value

Like this, the type-check is effectively moved from runtime to compilation time, and from the match-case statement into the constructor of the case class.

Of course, this is not a general solution, but can only be done, if we control the input to the match case statement.

In the present case, if you look at the input, from_avro(lit(1), fromAvroConfig).expr.productElement(2), then we can notice that from_avro(lit(1), fromAvroConfig).expr actually has the type za.co.absa.abris.avro.sql.AvroDataToCatalyst, which is a case class that looks like this:

private[abris] case class AvroDataToCatalyst(
  child: Expression,
  abrisConfig: Map[String,Any],
  schemaRegistryConf: Option[Map[String,String]]
) extends UnaryExpression

It wraps around the Option[Map[String,String]], exactly as outlined above, so we can actually match against the AvroDataToCatalyst type instead of Option[Map[String,String]].

The only problem is that the AvroDataToCatalyst type is package private. To circumvent that, we can change the package of this test utils class to za.co.absa.abris.avro.sql. This is an ugly hack, and I would generally not approve of it to solve this problem, but since it is only test code and the abris library is also owned by us, I would accept this workaround.