Fixes #11533.
This commit addresses the test failures reported in #11533, for the following tests:
json_matrix_test.py::test_from_json_long_structs()
json_matrix_test.py::test_scan_json_long_structs()
These failures are a result of #11711. When the JSON parser attempts to read integral struct members from a JSON file, if the parsing leads to an overflow, then the STRUCT column value is deemed null on Databricks 14.3 (i.e. without spark-rapids active). This behaviour differs from that exhibited by Apache Spark versions exceeding 3.4.1.
This commit breaks out the problematic JSON test rows into a separate file, whose read is tested in an xfail for Databricks 14.3. The remaining rows are tested on all versions.
The true fix for #11711 will be addressed later.
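As a rough illustration of the row split described above, the sketch below partitions JSON lines by whether any integer in them falls outside the signed 64-bit range. It is not the code in this commit: the helper names (`has_long_overflow`, `split_rows`) and the sample rows are hypothetical, and it assumes that "overflow" means an integer literal that does not fit in a 64-bit long.

```python
import json

# Bounds of a signed 64-bit integer (a Spark/Java long).
LONG_MIN = -(2 ** 63)
LONG_MAX = 2 ** 63 - 1


def has_long_overflow(value):
    """Return True if any integer nested in `value` is outside the 64-bit range."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; it can never overflow.
        return False
    if isinstance(value, int):
        return not (LONG_MIN <= value <= LONG_MAX)
    if isinstance(value, dict):
        return any(has_long_overflow(v) for v in value.values())
    if isinstance(value, list):
        return any(has_long_overflow(v) for v in value)
    return False


def split_rows(lines):
    """Partition JSON lines into (safe, overflowing) lists, preserving order."""
    safe, overflowing = [], []
    for line in lines:
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            # Assumption: malformed rows are unrelated to the overflow issue,
            # so they stay with the rows tested on all versions.
            safe.append(line)
            continue
        (overflowing if has_long_overflow(parsed) else safe).append(line)
    return safe, overflowing


# Hypothetical sample rows, not taken from the real test data.
rows = [
    '{"data": {"A": 1, "B": 2}}',
    '{"data": {"A": 9223372036854775807}}',   # LONG_MAX, still fits
    '{"data": {"A": 9223372036854775808}}',   # LONG_MAX + 1, overflows
    '{"data": {"A": -9223372036854775809}}',  # LONG_MIN - 1, overflows
]
safe_rows, overflow_rows = split_rows(rows)
```

Under these assumptions, `safe_rows` would go into the file read on all versions and `overflow_rows` into the separate file whose read is marked xfail on Databricks 14.3.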