Azure / azure-kusto-spark

Apache Spark Connector for Azure Kusto
Apache License 2.0

* Fix decimals to new scales 38,18 the default in Spark #258

Closed ag-ramachandran closed 2 years ago

ag-ramachandran commented 2 years ago

Pull Request Description

Kusto decimals are effectively floating point, while Scala/Java decimals are fixed point. Large decimals in Kusto are truncated or marked as null in Spark because we use (20,14) as precision and scale, which leaves only 6 digits for the integer part.

The PR addresses this by using the Spark default precision and scale (38,18). This is the same precision and scale used by Parquet exports in ADX.
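
To illustrate why (20,14) loses large values while (38,18) keeps them, here is a minimal sketch in plain Java using only `java.math.BigDecimal`. It is not the connector's code: the class `DecimalFit` and its methods `fits` and `maxIntegerDigits` are hypothetical names introduced for illustration. It assumes the value is rescaled with half-up rounding and then checked against the declared precision, which is how I understand a fixed-point DECIMAL(precision, scale) to behave.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper, not part of the connector: checks whether a value
// fits a fixed-point DECIMAL(precision, scale) type.
final class DecimalFit {

    // Digits available to the left of the decimal point.
    static int maxIntegerDigits(int precision, int scale) {
        return precision - scale;
    }

    // Rescale the value to `scale` fractional digits (half-up rounding is an
    // assumption here), then compare the total digit count with `precision`.
    static boolean fits(BigDecimal value, int precision, int scale) {
        BigDecimal rescaled = value.setScale(scale, RoundingMode.HALF_UP);
        return rescaled.precision() <= precision;
    }

    public static void main(String[] args) {
        BigDecimal large = new BigDecimal("12345678.5"); // 8 integer digits

        System.out.println("integer digits in (20,14): " + maxIntegerDigits(20, 14));
        System.out.println("integer digits in (38,18): " + maxIntegerDigits(38, 18));
        System.out.println("fits (20,14): " + fits(large, 20, 14));
        System.out.println("fits (38,18): " + fits(large, 38, 18));
    }
}
```

With (20,14), any value that has more than 6 integer digits fails the check, which matches the nulls described above. With (38,18), up to 20 integer digits are available.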

Fixes:

ag-ramachandran commented 2 years ago

Tested on Spark 2.4 and Spark 3.0

[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ connector-samples ---
[INFO] No tests to run.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for azure-kusto-spark 3.1.2:
[INFO]
[INFO] azure-kusto-spark .................................. SUCCESS [  6.354 s]
[INFO] Spark Kusto connector .............................. SUCCESS [07:08 min]
[INFO] connector-samples .................................. SUCCESS [  8.082 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  07:23 min
[INFO] Finished at: 2022-09-13T18:18:06+05:30
[INFO] ------------------------------------------------------------------------
PS C:\Code\azure-kusto-spark> mvn dependency:tree|Select-String "spark"

[WARNING] Some problems were encountered while building the effective model for com.microsoft.azure.kusto:kusto-spark_2.4_2.12:jar:3.1.2
[WARNING] 'artifactId' contains an expression but should be a constant. @ com.microsoft.azure.kusto:kusto-spark_${spark.version.major}_${scala.version.major}:${revision}, C:\Code\azure-kusto-spark\connector\pom.xml, line 5, column 17
[INFO] azure-kusto-spark                                                  [pom]
[INFO] Spark Kusto connector                                              [jar]
[INFO] ------------< com.microsoft.azure.kusto:azure-kusto-spark >-------------
[INFO] Building azure-kusto-spark 3.1.2                                   [1/3]
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ azure-kusto-spark ---
[INFO] com.microsoft.azure.kusto:azure-kusto-spark:pom:3.1.2
[INFO] -----------< com.microsoft.azure.kusto:kusto-spark_2.4_2.12 >-----------
[INFO] Building Spark Kusto connector 3.1.2                               [2/3]
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ kusto-spark_2.4_2.12 ---
[INFO] com.microsoft.azure.kusto:kusto-spark_2.4_2.12:jar:3.1.2
[INFO] +- org.apache.spark:spark-sql_2.12:jar:2.4.1:provided
[INFO] |  +- org.apache.spark:spark-sketch_2.12:jar:2.4.1:provided
[INFO] |  +- org.apache.spark:spark-core_2.12:jar:2.4.1:provided
[INFO] |  |  +- org.apache.spark:spark-launcher_2.12:jar:2.4.1:provided
[INFO] |  |  +- org.apache.spark:spark-kvstore_2.12:jar:2.4.1:provided
[INFO] |  |  +- org.apache.spark:spark-network-common_2.12:jar:2.4.1:provided
[INFO] |  |  +- org.apache.spark:spark-network-shuffle_2.12:jar:2.4.1:provided
[INFO] |  +- org.apache.spark:spark-tags_2.12:jar:2.4.1:provided
[INFO] |  \- org.spark-project.spark:unused:jar:1.0.0:provided
[INFO] +- org.apache.spark:spark-catalyst_2.12:jar:2.4.1:provided
[INFO] |  +- org.apache.spark:spark-unsafe_2.12:jar:2.4.1:provided
[INFO] +- com.microsoft.azure.kusto:kusto-spark_3.0_2.12:jar:3.1.2:compile
[INFO] +- com.microsoft.azure:azure-eventhubs-spark_2.12:jar:2.3.17:compile
[INFO] +- org.apache.spark:spark-sql_2.12:jar:2.4.1:compile
[INFO] |  +- org.apache.spark:spark-sketch_2.12:jar:2.4.1:compile
[INFO] |  +- org.apache.spark:spark-core_2.12:jar:2.4.1:compile
[INFO] |  |  +- org.apache.spark:spark-launcher_2.12:jar:2.4.1:compile
[INFO] |  |  +- org.apache.spark:spark-kvstore_2.12:jar:2.4.1:compile
[INFO] |  |  +- org.apache.spark:spark-network-common_2.12:jar:2.4.1:compile
[INFO] |  |  +- org.apache.spark:spark-network-shuffle_2.12:jar:2.4.1:compile
[INFO] |  +- org.apache.spark:spark-tags_2.12:jar:2.4.1:compile
[INFO] |  \- org.spark-project.spark:unused:jar:1.0.0:compile
[INFO] \- org.apache.spark:spark-catalyst_2.12:jar:2.4.1:compile
[INFO]    +- org.apache.spark:spark-unsafe_2.12:jar:2.4.1:compile
[INFO] Reactor Summary for azure-kusto-spark 3.1.2:

On Spark 3.0

[INFO] Reactor Summary for azure-kusto-spark 3.1.2:
[INFO]
[INFO] azure-kusto-spark .................................. SUCCESS [  5.632 s]
[INFO] Spark Kusto connector .............................. SUCCESS [07:14 min]
[INFO] connector-samples .................................. SUCCESS [  7.595 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  07:27 min
[INFO] Finished at: 2022-09-13T18:38:57+05:30
[INFO] ------------------------------------------------------------------------
PS C:\Code\azure-kusto-spark> mvn dependency:tree|Select-String "spark"             

[WARNING] Some problems were encountered while building the effective model for com.microsoft.azure.kusto:kusto-spark_3.0_2.12:jar:3.1.2
[WARNING] 'artifactId' contains an expression but should be a constant. @ com.microsoft.azure.kusto:kusto-spark_${spark.version.major}_${scala.version.major}:${revision}, C:\Code\azure-kusto-spark\connector\pom.xml, line 5, column 17
[INFO] azure-kusto-spark                                                  [pom]
[INFO] Spark Kusto connector                                              [jar]
[INFO] ------------< com.microsoft.azure.kusto:azure-kusto-spark >-------------
[INFO] Building azure-kusto-spark 3.1.2                                   [1/3]
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ azure-kusto-spark ---
[INFO] com.microsoft.azure.kusto:azure-kusto-spark:pom:3.1.2
[INFO] -----------< com.microsoft.azure.kusto:kusto-spark_3.0_2.12 >-----------
[INFO] Building Spark Kusto connector 3.1.2                               [2/3]
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ kusto-spark_3.0_2.12 ---
[INFO] com.microsoft.azure.kusto:kusto-spark_3.0_2.12:jar:3.1.2
[INFO] +- org.apache.spark:spark-sql_2.12:jar:3.0.1:provided
[INFO] |  +- org.apache.spark:spark-sketch_2.12:jar:3.0.1:provided
[INFO] |  +- org.apache.spark:spark-core_2.12:jar:3.0.1:provided
[INFO] |  |  +- org.apache.spark:spark-launcher_2.12:jar:3.0.1:provided
[INFO] |  |  +- org.apache.spark:spark-kvstore_2.12:jar:3.0.1:provided
[INFO] |  |  +- org.apache.spark:spark-network-common_2.12:jar:3.0.1:provided
[INFO] |  |  +- org.apache.spark:spark-network-shuffle_2.12:jar:3.0.1:provided
[INFO] |  +- org.apache.spark:spark-tags_2.12:jar:3.0.1:provided
[INFO] |  \- org.spark-project.spark:unused:jar:1.0.0:provided
[INFO] +- org.apache.spark:spark-catalyst_2.12:jar:3.0.1:provided
[INFO] |  +- org.apache.spark:spark-unsafe_2.12:jar:3.0.1:provided
[INFO] +- com.microsoft.azure.kusto:kusto-spark_3.0_2.12:jar:3.1.2:compile
[INFO] +- com.microsoft.azure:azure-eventhubs-spark_2.12:jar:2.3.17:compile
[INFO] +- org.apache.spark:spark-sql_2.12:jar:3.0.1:compile
[INFO] |  +- org.apache.spark:spark-sketch_2.12:jar:3.0.1:compile
[INFO] |  +- org.apache.spark:spark-core_2.12:jar:3.0.1:compile
[INFO] |  |  +- org.apache.spark:spark-launcher_2.12:jar:3.0.1:compile
[INFO] |  |  +- org.apache.spark:spark-kvstore_2.12:jar:3.0.1:compile
[INFO] |  |  +- org.apache.spark:spark-network-common_2.12:jar:3.0.1:compile
[INFO] |  |  +- org.apache.spark:spark-network-shuffle_2.12:jar:3.0.1:compile
[INFO] |  +- org.apache.spark:spark-tags_2.12:jar:3.0.1:compile
[INFO] |  \- org.spark-project.spark:unused:jar:1.0.0:compile
[INFO] \- org.apache.spark:spark-catalyst_2.12:jar:3.0.1:compile
[INFO]    +- org.apache.spark:spark-unsafe_2.12:jar:3.0.1:compile
[INFO] Reactor Summary for azure-kusto-spark 3.1.2:
[INFO] azure-kusto-spark .................................. SUCCESS [  6.743 s]
[INFO] Spark Kusto connector .............................. SUCCESS [  2.218 s]