Open jeffbrennan opened 6 months ago
hey @MrPowers, this is ready for review. I based the output on how PySpark prints schemas with the printSchema() function. I verified that this approach works with multiple levels of nested structs, schemas of different lengths, and with ignore_nullable and ignore_metadata each set to True and False.
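For anyone unfamiliar with the format, here is a minimal sketch of a printSchema()-style tree for a nested schema. It uses plain Python tuples as a stand-in for Spark's schema types so it runs without Spark. The `tree_lines` helper and the tuple layout are hypothetical illustrations, not the code in this PR.

```python
# Hypothetical stand-in for a Spark schema (not PySpark's StructType):
# each field is (name, dtype, nullable); dtype is a type-name string,
# or a list of fields when the column is a struct.
def tree_lines(fields, depth=0):
    """Return lines laid out like the tree that printSchema() prints."""
    prefix = " |   " * depth + " |-- "
    lines = []
    for name, dtype, nullable in fields:
        is_struct = isinstance(dtype, list)
        type_name = "struct" if is_struct else dtype
        lines.append(
            f"{prefix}{name}: {type_name} (nullable = {str(nullable).lower()})"
        )
        if is_struct:
            # Recurse one level deeper for nested structs.
            lines.extend(tree_lines(dtype, depth + 1))
    return lines


schema = [
    ("name", "string", True),
    ("address", [("city", "string", True), ("zip", "integer", False)], True),
]
print("\n".join(["root"] + tree_lines(schema)))
# root
#  |-- name: string (nullable = true)
#  |-- address: struct (nullable = true)
#  |    |-- city: string (nullable = true)
#  |    |-- zip: integer (nullable = false)
```

A side-by-side diff could then pair up the lines from two such lists, which is roughly the shape of output described here.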
Standard example:
Different lengths:
Multiple nested structs:
Some thoughts for additional features
Currently this auto-determines whether the output should be a table or a tree string based on whether the provided schemas have nested structs or > 10 columns. Should this be user-configurable?
Truncation of wide tree string outputs. If a tree string line is wider than a certain threshold (~80 characters), should I just truncate the right schema output, or both the left and right evenly?
Output looks good in the terminal but not in VSCode when an exception is caught. Should we just have a standard SchemasNotEqualError message and log the diff tree/table separately in the terminal?
ignore_metadata = False is a little confusing currently (see name column in example below). Should we add a text indicator to explain that two lines are differing because of their metadata?
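To make the first two questions concrete, here is a rough sketch of what a user-configurable format choice and even truncation could look like. Everything in it is an assumption for discussion: the `output_format` and `max_table_columns` parameters, the helper names, and the (name, dtype, nullable) tuples standing in for Spark schema fields are all hypothetical, and it assumes nested or wide schemas fall back to the tree string.

```python
# Hypothetical stand-in for a schema: a list of (name, dtype, nullable),
# where dtype is a type-name string or a list of fields for a struct.
def has_nested_struct(fields):
    return any(isinstance(dtype, list) for _name, dtype, _nullable in fields)


def choose_output_format(left, right, output_format="auto", max_table_columns=10):
    """Pick "table" or "tree"; an explicit choice overrides the heuristic."""
    if output_format in ("table", "tree"):
        return output_format
    if output_format != "auto":
        raise ValueError(f"unknown output_format: {output_format!r}")
    nested = has_nested_struct(left) or has_nested_struct(right)
    too_many = max(len(left), len(right)) > max_table_columns
    return "tree" if nested or too_many else "table"


def clip(text, width):
    """Shorten text to at most `width` characters, marking the cut with '...'."""
    if len(text) <= width:
        return text
    return text[: max(width - 3, 0)] + "..."


def fit_side_by_side(left, right, max_width=80, sep=" | "):
    """Join two tree lines, trimming both sides evenly when the pair is too wide."""
    budget = max_width - len(sep)
    half = budget // 2
    if len(left) + len(right) > budget:
        if len(left) <= half:
            # Left is short: give its unused share to the right side.
            right = clip(right, budget - len(left))
        elif len(right) <= budget - half:
            # Right is short: give its unused share to the left side.
            left = clip(left, budget - len(right))
        else:
            left, right = clip(left, half), clip(right, budget - half)
    return left + sep + right
```

With this shape, passing output_format="table" or "tree" would override the automatic choice, and a short line on one side lets the other side use the leftover width instead of being cut at the halfway point.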
This fix is super helpful! Any chance it can get merged? Happy to try resolving the merge conflicts if you need help.
thanks for putting this back on my radar! I'll work on resolving the merge conflicts
@fpgmaas @SemyonSinchenko could one of y'all please review the changes after my recent updates?
For the merge conflicts, most of the differences were formatting/line length related and I took the incoming version from main in those instances.
I also experienced some pytest failures after the merge that I have addressed in the subsequent commits.
- six package in schema_comparer.py but not in pyproject.toml or the imports at the top of the file
- create_schema_comparison_tree in test_schema_comparer.py
- blue() function in schema_comparer.py
Let me know if anything should be changed!
@jeffbrennan build is failing due to formatting issues. Could you run the pre-commit hooks locally? Sorry, we should reflect this in CONTRIBUTING.md
@fpgmaas passing locally after my recent changes
addresses #88
See comment below