[Open] deenkar opened this issue 2 years ago
Tests in AnalyzerTests were also failing.
For example, the Pearson correlation test:
"yield 1.0 for maximal conditionally informative columns" in withSparkSession { sparkSession =>
val df = getDfWithConditionallyInformativeColumns(sparkSession)
Correlation("att1", "att2").calculate(df) shouldBe DoubleMetric(
Entity.Mutlicolumn,
"Correlation",
"att1,att2",
Success(1.0)
)
The assertion fails with:

DoubleMetric(Mutlicolumn,Correlation,att1,att2,Failure(com.amazon.deequ.analyzers.runners.MetricCalculationRuntimeException: java.lang.ClassCastException: java.lang.Double cannot be cast to org.apache.spark.sql.Row)) was not equal to DoubleMetric(Mutlicolumn,Correlation,att1,att2,Success(1.0))
ScalaTestFailureLocation: org.scalatest.matchers.MatchersHelper$ at (AnalyzerTests.scala:664)
Expected :DoubleMetric(Mutlicolumn,Correlation,att1,att2,Success(1.0))
Actual   :DoubleMetric(Mutlicolumn,Correlation,att1,att2,Failure(com.amazon.deequ.analyzers.runners.MetricCalculationRuntimeException: java.lang.ClassCastException: java.lang.Double cannot be cast to org.apache.spark.sql.Row))
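For context, the ClassCastException in that stack trace is easy to reproduce outside of deequ. The sketch below is not deequ's implementation; it only demonstrates that Spark's corr aggregate returns a plain DoubleType value, so any code that reads the aggregation result with getAs[Row] fails with exactly this cast error, which is presumably what the analyzer's result extraction hits on this branch:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.corr

// Standalone illustration of the cast failure, NOT deequ's code.
object CorrCastIllustration {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("corr-cast-illustration")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1.0, 2.0), (2.0, 4.0), (3.0, 6.0)).toDF("att1", "att2")

    // corr(...) produces a DoubleType aggregate
    val result: Row = df.agg(corr("att1", "att2")).first()

    // Works: the aggregate value is a java.lang.Double
    println(result.getDouble(0)) // 1.0

    // Fails: java.lang.ClassCastException:
    //   java.lang.Double cannot be cast to org.apache.spark.sql.Row
    val asRow = result.getAs[Row](0)
    println(asRow)

    spark.stop()
  }
}
```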
Hi @TammoR
When running the Correlation analyzer on the tammruka/2.0.0-spark-3.2.0 branch, it gives the error below and does not compute the correlation. The input rows use this case class:
case class NewNumRawData(totalNumber: Integer, count: Integer)
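For reference, the analysis is wired up roughly like this (a minimal sketch using deequ's AnalysisRunner API; the row values and the CorrelationRepro object name are placeholders, not the exact code):

```scala
import com.amazon.deequ.analyzers.Correlation
import com.amazon.deequ.analyzers.runners.{AnalysisRunner, AnalyzerContext}
import org.apache.spark.sql.SparkSession

object CorrelationRepro {

  // Same case class as above
  case class NewNumRawData(totalNumber: Integer, count: Integer)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("correlation-repro")
      .getOrCreate()
    import spark.implicits._

    // Placeholder rows; any numeric values hit the same failure
    val data = Seq(
      NewNumRawData(1, 10),
      NewNumRawData(2, 20),
      NewNumRawData(3, 30)
    ).toDF()

    val result: AnalyzerContext = AnalysisRunner
      .onData(data)
      .addAnalyzer(Correlation("totalNumber", "count"))
      .run()

    println(result.metricMap)
    spark.stop()
  }
}
```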
It fails with the following error in the analysis result:

AnalyzerContext(Map(
  Correlation(totalNumber,count,None) ->
    DoubleMetric(Mutlicolumn,Correlation,totalNumber,count,
      Failure(com.amazon.deequ.analyzers.runners.MetricCalculationRuntimeException:
        java.lang.ClassCastException: java.lang.Double cannot be cast to org.apache.spark.sql.Row))))