Unable to supersede IdentityToZeroTransformation and NullToZeroTransformation

What went wrong?

Both IdentityToZeroTransformation and NullToZeroTransformation handle special cases in which LinearTransformer maps a numeric column whose values are either all identical or all null. Ideally, both should be superseded by LinearTransformation instances when "regular" data is appended. Currently this does not happen: the append fails with a scala.MatchError instead.
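
To make the expected behaviour concrete, here is a minimal, self-contained sketch of the superseding rule described above. It is not the qbeast-spark implementation: the names IdentitySketch, LinearSketch and afterAppend are hypothetical, and the null case is left out for brevity.

```scala
object SupersedeSketch {
  // Hypothetical, simplified model of a column transformation (NOT the qbeast-spark API).
  sealed trait TransformSketch {
    def transform(value: Double): Double
  }

  // Placeholder used while every indexed value of the column is the same.
  final case class IdentitySketch(identityValue: Double) extends TransformSketch {
    def transform(value: Double): Double = 0.0
  }

  // Regular min-max mapping of the column into [0, 1].
  final case class LinearSketch(min: Double, max: Double) extends TransformSketch {
    require(min < max, "a linear mapping needs a non-empty range")
    def transform(value: Double): Double = (value - min) / (max - min)
  }

  // Picks the transformation to keep after `appended` values are written.
  def afterAppend(current: TransformSketch, appended: Seq[Double]): TransformSketch =
    if (appended.isEmpty) current
    else
      current match {
        case IdentitySketch(v) =>
          val lo = math.min(v, appended.min)
          val hi = math.max(v, appended.max)
          // The placeholder is superseded as soon as a real range exists.
          if (lo < hi) LinearSketch(lo, hi) else current
        case LinearSketch(min, max) =>
          // An existing range is only ever widened.
          LinearSketch(math.min(min, appended.min), math.max(max, appended.max))
      }
}
```

Under this model, appending the values 1 to 1000 to a column that previously held only 1 would replace the placeholder with a linear mapping over [1, 1000], while appending more 1s would leave the placeholder in place.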

How to reproduce?

For example, with IdentityToZeroTransformation (NullToZeroTransformation behaves similarly):

import org.apache.spark.sql.delta.DeltaLog
import io.qbeast.spark.delta.DeltaQbeastSnapshot
import io.qbeast.core.transform.IdentityToZeroTransformation
import spark.implicits._

case class IdentityCls(col1: String, col2: Int, col3: Double)

val idTestPath = "/tmp/test1/"
// Every row is identical, so the indexed column col2 holds a single value
val identityData = (1 to 1000).map(_ => IdentityCls("1", 1, 1d)).toDS()
(identityData
    .write
    .mode("overwrite")
    .option("columnsToIndex", "col2")
    .option("cubeSize", "10000")
    .format("qbeast")
    .save(idTestPath)
)

(DeltaQbeastSnapshot(DeltaLog.forTable(spark, idTestPath)
  .update())
  .loadLatestRevision
  .transformations
  .head
  .isInstanceOf[IdentityToZeroTransformation]
) // true

// Appending non-identical data fails with:
// scala.MatchError at io.qbeast.core.transform.IdentityToZeroTransformation.transform(Transformation.scala:56)
((1 to 1000)
  .map(i => IdentityCls(s"$i", i, i.toDouble))
  .toDS()
  .write
  .mode("append")
  .format("qbeast")
  .save(idTestPath)
)
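
A scala.MatchError is what Scala throws when no case of a pattern match applies. The sketch below shows one way such an error can arise, assuming transform only matches the stored identity value. That is an assumption made for illustration, not something confirmed against Transformation.scala, and IdentityOnlySketch is a hypothetical stand-in.

```scala
import scala.util.Try

object MatchErrorSketch {
  // Hypothetical stand-in; NOT the library's implementation.
  final case class IdentityOnlySketch(identityValue: Int) {
    def transform(value: Any): Double = value match {
      case v: Int if v == identityValue => 0.0
      // No fallback case: any other value throws scala.MatchError.
    }
  }

  val placeholder = IdentityOnlySketch(1)

  // The identity value itself is mapped without error.
  val same = Try(placeholder.transform(1))

  // A value that differs from the identity value fails the match.
  val different = Try(placeholder.transform(2))
}
```

If the real cause is similar, the placeholder would have to be replaced before the appended rows are transformed, or the match would need a case for values other than the identity value.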

2. Branch and commit id:

main, f066acf

3. Spark version:

3.4.1

4. Hadoop version:

3.3.4

5. How are you running Spark?

Locally

Qbeast-io / qbeast-spark

Unable to supersede IdentityToZeroTransformation and NullToZeroTransformation #224