MrPowers / chispa

PySpark test helper methods with beautiful error messages
https://mrpowers.github.io/chispa/
MIT License

feat: Introduce `FormattingConfig` and deprecate `DefaultFormats` #127

Closed fpgmaas closed 2 months ago

fpgmaas commented 2 months ago

PR Checklist

Description of changes

This PR proposes deprecating the existing ways of configuring chispa's output format, namely DefaultFormats and arbitrary dataclasses. The current approach has a few issues.

As an alternative, this PR proposes replacing them with the classes Color, Style, Format, and FormattingConfig, which can be used as follows:

from chispa.formatting import FormattingConfig

formats = FormattingConfig(
    mismatched_rows={"color": "light_yellow"},
    matched_rows={"color": "cyan", "style": "bold"},
    mismatched_cells={"color": "purple"},
    matched_cells={"color": "blue"},
)

assert_basic_rows_equality(df1.collect(), df2.collect(), formats=formats)

or, overriding only one of the formats:

formats = FormattingConfig(
    mismatched_rows={"color": "light_yellow"},
)

assert_basic_rows_equality(df1.collect(), df2.collect(), formats=formats)

or similarly, using the Color and Style enums instead of strings:

from chispa.formatting import FormattingConfig, Color, Style

formats = FormattingConfig(
    mismatched_rows={"color": Color.LIGHT_YELLOW},
    matched_rows={"color": Color.CYAN, "style": Style.BOLD},
    mismatched_cells={"color": Color.PURPLE},
    matched_cells={"color": Color.BLUE},
)

assert_basic_rows_equality(df1.collect(), df2.collect(), formats=formats)
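
To illustrate how a config like this might accept both plain strings and enum members, here is a minimal, self-contained sketch. It is not the code in this PR: the classes below are stand-ins written for illustration, the enum members are limited to the names used in the examples above (plus RED and UNDERLINE for defaults), and the default formats are assumptions. The class is named SketchFormattingConfig to make clear it is hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List, Optional, Union


class Color(str, Enum):
    # Stand-in enum; the real set of supported colors may differ.
    BLUE = "blue"
    CYAN = "cyan"
    PURPLE = "purple"
    LIGHT_YELLOW = "light_yellow"
    RED = "red"


class Style(str, Enum):
    # Stand-in enum; the real set of supported styles may differ.
    BOLD = "bold"
    UNDERLINE = "underline"


@dataclass
class Format:
    color: Optional[Color] = None
    style: List[Style] = field(default_factory=list)

    @classmethod
    def from_dict(cls, spec: Dict[str, Any]) -> "Format":
        # Reject misspelled keys early instead of silently ignoring them.
        unknown = set(spec) - {"color", "style"}
        if unknown:
            raise ValueError(f"Unknown format keys: {sorted(unknown)}")
        color = spec.get("color")
        style = spec.get("style", [])
        # Accept a single style as well as a list of styles.
        if isinstance(style, str):  # Style members are also str instances
            style = [style]
        # Enum lookup by value raises ValueError for unsupported names.
        return cls(
            color=Color(color) if color is not None else None,
            style=[Style(s) for s in style],
        )


FormatSpec = Union[Format, Dict[str, Any], None]


def _coerce(spec: FormatSpec, default: Format) -> Format:
    # None means "keep the default"; dicts are validated and converted.
    if spec is None:
        return default
    if isinstance(spec, Format):
        return spec
    return Format.from_dict(spec)


class SketchFormattingConfig:
    """Hypothetical stand-in for FormattingConfig; defaults are assumed."""

    def __init__(
        self,
        mismatched_rows: FormatSpec = None,
        matched_rows: FormatSpec = None,
        mismatched_cells: FormatSpec = None,
        matched_cells: FormatSpec = None,
    ) -> None:
        self.mismatched_rows = _coerce(mismatched_rows, Format(Color.RED))
        self.matched_rows = _coerce(matched_rows, Format(Color.BLUE))
        self.mismatched_cells = _coerce(
            mismatched_cells, Format(Color.RED, [Style.UNDERLINE])
        )
        self.matched_cells = _coerce(matched_cells, Format(Color.BLUE))


config = SketchFormattingConfig(
    mismatched_rows={"color": "light_yellow"},
    matched_rows={"color": Color.CYAN, "style": Style.BOLD},
)
```

In this sketch, strings and enum members both end up as enum values, so a typo such as {"color": "yelow"} or {"colour": "red"} fails with a ValueError when the config is built, not later while a test failure is being rendered.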

This brings:


Notes

fpgmaas commented 2 months ago

@SemyonSinchenko I have some more ideas for improvements based on your feedback. Will change it to a draft PR and get back to you soon!