Closed eemhu closed 5 months ago
Note: check commands like `isnull` and `isnotnull` after this change.
Based on some research, I think the following is the way to go:
Basically, we need to go through all the commands that can produce an "empty row" and make sure the result conforms to Spark's null specification.
Describe the bug
Currently, different commands may return different representations of null: the string "null", the empty string "", or Spark's actual NULL field.
Expected behavior
All null values should be represented the same way to avoid confusion and issues in downstream processing.
How to reproduce
For example, the `dedup` command uses the string "null" for null values, while `spath` uses the empty string.
Software version
pth_10 4.16.0
Additional context
https://sparkbyexamples.com/spark/spark-replace-empty-value-with-null-on-dataframe/
To be fixed in the Spark 3 version, as the null specification changes from 2.4 to 3.x.
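The normalization rule described above can be sketched in plain Python. This is a hypothetical illustration, not the actual pth_10 code: it maps the three null representations named in the issue (the string "null", the empty string "", and a true null) to a single canonical `None`, analogous to what the linked article does column-wise on a Spark DataFrame with `when`/`otherwise`.

```python
# Hypothetical sketch of the null-normalization rule; names are illustrative,
# not from the pth_10 codebase.

# The null-like string representations the issue mentions.
NULL_LIKE = {"null", ""}

def normalize_null(value):
    """Map any null-like representation to None; pass other values through."""
    if value is None or value in NULL_LIKE:
        return None
    return value

def normalize_row(row):
    """Apply the normalization to every cell in a row (dict of column -> value)."""
    return {col: normalize_null(v) for col, v in row.items()}
```

In Spark itself the equivalent per-column fix would be along the lines of `when(col(c) == "", lit(None)).otherwise(col(c))`, as shown in the linked article; the sketch above only captures the mapping each affected command should agree on.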