sodadata / soda-sql

Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html
https://docs.soda.io/
Apache License 2.0

[Spark] Variance of numeric column is zero `0.0` #151

Closed JCZuurmond closed 2 years ago

JCZuurmond commented 2 years ago

Describe the bug

The variance shown in the column info is zero for my numeric columns. It should be non-zero.

To Reproduce

  1. Set up Soda to run with Spark.
  2. Set up Soda to send metrics to the cloud.
  3. Create a table with a numeric column.
  4. Run a scan for that table.
  5. Check the column information: it shows the variance as zero.

Context

NA

OS: NA
Python Version: 3.9
Soda SQL Version: 2.1.0b18
Warehouse Type: Spark

vijaykiran commented 2 years ago

Is it shown as zero only in Cloud or in the output of Soda SQL as well?

JCZuurmond commented 2 years ago

I have only checked it in the cloud.

vijaykiran commented 2 years ago

There is a test for variance in Soda SQL, and it is not failing. Do you have a sample of numbers so I can quickly try it out? Perhaps the variance is so small that it is rounded to zero.
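
To illustrate the rounding hypothesis, here is a minimal sketch in plain Python. It is not taken from Soda SQL or Soda Cloud: the sample values and the two-decimal display precision are assumptions made only for this example. It shows how a non-zero variance can end up displayed as `0.0`.

```python
import statistics

# Hypothetical sample data: a numeric column whose values differ only slightly.
values = [0.001, 0.002, 0.003, 0.004, 0.005]

# Sample variance (n - 1 denominator); small but clearly non-zero.
variance = statistics.variance(values)

# Assumed display precision of two decimals (an assumption, not a known Soda setting).
rounded = round(variance, 2)

print(f"raw variance:     {variance!r}")
print(f"rounded variance: {rounded!r}")
```

If the column in question holds values on a scale like this, comparing the raw scan output with what the cloud shows would help tell a rounding effect apart from a real computation bug.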

JCZuurmond commented 2 years ago

Closing this issue until I find some data to create a test for this.