MrPowers / chispa

PySpark test helper methods with beautiful error messages
https://mrpowers.github.io/chispa/
MIT License

Row equality formatter #92

Closed MrPowers closed 7 months ago

MrPowers commented 7 months ago

This PR makes the DataFrame inequality formatting fully configurable. Here is an example:

[Image: Screenshot 2024-02-17 at 11.03.13 AM]

The user can specify formats and use them as follows:

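from dataclasses import dataclass

from chispa.rows_comparer import assert_basic_rows_equality  # import path assumed

@dataclass
class MyFormats:
    # Each attribute is a list of style names that are applied together.
    mismatched_rows = ["light_yellow"]
    matched_rows = ["cyan", "bold"]
    mismatched_cells = ["purple"]
    matched_cells = ["blue"]

# df1 and df2 are existing PySpark DataFrames.
assert_basic_rows_equality(df1.collect(), df2.collect(), formats=MyFormats())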
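
One detail about the class above: `@dataclass` only turns annotated attributes into fields, so the four lists are plain class attributes and cannot be overridden through the constructor. If per-instance overrides are wanted, a typed variant could look like the sketch below. The name `TypedFormats` is hypothetical, and the sketch assumes the formatter only reads these four attributes from whatever object is passed as `formats`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TypedFormats:
    # Annotated attributes become real dataclass fields.
    # default_factory gives every instance its own list instead of a shared one.
    mismatched_rows: List[str] = field(default_factory=lambda: ["light_yellow"])
    matched_rows: List[str] = field(default_factory=lambda: ["cyan", "bold"])
    mismatched_cells: List[str] = field(default_factory=lambda: ["purple"])
    matched_cells: List[str] = field(default_factory=lambda: ["blue"])


# Override one style and keep the remaining defaults.
custom = TypedFormats(mismatched_cells=["red", "bold"])
```

An instance such as `custom` would then be passed as `formats=custom`, in the same way as `MyFormats()`.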

Users can also define these formats in conftest.py and inject them via a fixture:

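import pytest

# In conftest.py, alongside the MyFormats definition
@pytest.fixture()
def my_formats():
    return MyFormats()

# In a test module; pytest injects the fixture by argument name
def test_shows_assert_basic_rows_equality(my_formats):
    ...
    assert_basic_rows_equality(df1.collect(), df2.collect(), formats=my_formats)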

The current settings hardcode blue, red, and white, which isn't great for some terminals. This is more flexible. The output is also cleaner because the Row() clutter has been removed.
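
To make the idea concrete, here is a minimal sketch of how lists of style names could be turned into terminal output. It is not chispa's implementation: `ANSI_CODES`, `apply_formats`, and `format_row_pair` are hypothetical names, and the name-to-code table is an assumption based on standard ANSI SGR codes. Rows are represented as plain tuples so the example runs without Spark.

```python
from dataclasses import dataclass

# Hypothetical table mapping style names to ANSI SGR codes.
# The names chispa actually supports may differ.
ANSI_CODES = {
    "bold": "1",
    "red": "31",
    "blue": "34",
    "purple": "35",
    "cyan": "36",
    "white": "37",
    "light_yellow": "93",
}
RESET = "\033[0m"


def apply_formats(text, formats):
    """Wrap text in one escape sequence that combines all given style names."""
    codes = ";".join(ANSI_CODES[name] for name in formats)
    return f"\033[{codes}m{text}{RESET}"


@dataclass
class MyFormats:
    # Same shape as the formats class in the PR description.
    mismatched_rows = ["light_yellow"]
    matched_rows = ["cyan", "bold"]
    mismatched_cells = ["purple"]
    matched_cells = ["blue"]


def format_row_pair(row1, row2, formats):
    """Render two rows as styled strings, choosing a style per cell."""
    # The row-level style is used for the separators between cells.
    row_style = formats.matched_rows if row1 == row2 else formats.mismatched_rows
    left, right = [], []
    for a, b in zip(row1, row2):
        # The cell-level style depends on whether this pair of cells matches.
        cell_style = formats.matched_cells if a == b else formats.mismatched_cells
        left.append(apply_formats(repr(a), cell_style))
        right.append(apply_formats(repr(b), cell_style))
    sep = apply_formats(", ", row_style)
    return sep.join(left), sep.join(right)


left, right = format_row_pair(("bob", 1), ("bob", 2), MyFormats())
print(left)
print(right)
```

Because the styles are looked up by attribute name, swapping in a different formats object changes the colors without touching the comparison logic, which is the flexibility this PR is after.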