delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0
7.56k stars 1.7k forks source link

[Kernel][Testing] Add more thorough testing for DataType.toJson #2313

Open allisonport-db opened 11 months ago

allisonport-db commented 11 months ago

Feature request

Which Delta project/connector is this regarding?

Overview

All of our DataType classes provide toJson which serializes the type following the delta protocol schema serialization rules https://github.com/delta-io/delta/blob/master/PROTOCOL.md#schema-serialization-format. We use this method to serialize the physical and logical schemas in ScanStateRow.

toJson is only tested by the 2 round-trip tests in DefaultJsonHandlerSuite. We should add more thorough tests for toJson.

nitinmahawadiwar commented 4 months ago

Hello @allisonport-db , I would like to take this issue. I am new to the Delta and worked on Spark/Java earlier.

Can I request you to guide me on this ?

vkorukanti commented 4 months ago

Hi @nitinmahawadiwar, currently DataTypeParserSuite.scala has test cases for JSON string to DataType. We need to test the other direction DataType to JSON string. One way is to modify the tests in the suite like below. Basically make the tests round trip "JSON string" -> DataType -> "JSON string"

-  private def checkDataType(dataTypeString: String, expectedDataType: DataType): Unit = {
+  private def checkDataTypeRoundTrip(dataTypeString: String, expectedDataType: DataType): Unit = {
     test(s"parseDataType: parse ${dataTypeString.replace("\n", "")}") {
-      assert(parse(dataTypeString) === expectedDataType)
+      val dataType = parse(dataTypeString)
+      assert(dataType === expectedDataType)
+      assert(dataType.toJson === dataTypeString)
     }
   }

Let me know if you any further qns. At the end of this, we may want to rename DataTypeParser to DataTypeSerDe, but lets get the tests added first.

nitinmahawadiwar commented 4 months ago

Hi @vkorukanti , thanks for briefing me. I am going through the project setup & code understanding in my local environment. I will connect here with you again.