paulfitz / daff

align and compare tables
https://paulfitz.github.io/daff
MIT License

[Python] [Bug] TableDiff.hasDifference is sensitive to order when inputs contain multiple identical rows containing `None`, even if CompareFlags.ordered is False #200

Open MichelleArk opened 3 months ago

MichelleArk commented 3 months ago

Problem:

`TableDiff.hasDifference` unexpectedly returns `True` when comparing two input tables that have multiple identical rows containing `None`, even if `CompareFlags.ordered` is set to `False`.

I've observed this in the Python daff package; I'm not sure whether it also affects other language bindings.

Repro case:

import daff

table1 = daff.PythonTableView(
    [
        ['id', 'status'],
        [None, 'B'],
        [None, 'B'],
        [3, 'A']
    ]
)
table2 = daff.PythonTableView(
    [
        ['id', 'status'],
        [3, 'A'],
        [None, 'B'],
        [None, 'B']
    ]
)

flags = daff.CompareFlags()
flags.ordered = False

alignment = daff.Coopy.compareTables(table1, table2, flags).align()
result = daff.PythonTableView([])

diff = daff.TableDiff(alignment, flags)
diff.hilite(result)

assert not diff.hasDifference()  # raises AssertionError: hasDifference() returns True

Workaround in dbt-core: https://github.com/dbt-labs/dbt-core/pull/10202
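
For anyone who only needs a yes/no answer on whether two tables hold the same rows regardless of order, here is a minimal sketch that avoids daff entirely by comparing the data rows as multisets. This is not the approach taken in the dbt-core PR above and not part of daff's API; `same_rows_ignoring_order` is a hypothetical helper name, and the sketch assumes the first row is the header and every cell is hashable.

```python
from collections import Counter


def same_rows_ignoring_order(rows_a, rows_b):
    """Hypothetical helper: True if both tables share a header and the
    same multiset of data rows, regardless of row order."""
    if not rows_a or not rows_b:
        # At least one table is empty: equal only if both are empty.
        return rows_a == rows_b
    header_a, *body_a = rows_a
    header_b, *body_b = rows_b
    if header_a != header_b:
        return False
    # Counter keeps duplicate counts, so repeated rows such as
    # [None, 'B'] must appear the same number of times on both sides.
    return Counter(map(tuple, body_a)) == Counter(map(tuple, body_b))


# Same data as the repro case, as plain lists of lists.
table1 = [['id', 'status'], [None, 'B'], [None, 'B'], [3, 'A']]
table2 = [['id', 'status'], [3, 'A'], [None, 'B'], [None, 'B']]

print(same_rows_ignoring_order(table1, table2))
```

Unlike a real diff, this reports only whether the tables match and says nothing about which rows differ, so it is a stopgap for equality checks rather than a replacement for `TableDiff`.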