Closed by alex-mirkin 1 month ago
Thanks, I'll try to fix it soon.
I pushed the fix to PR #44.
Since you already have benchmarks in place, could I trouble you to check that performance has not degraded too much?
Looks good! Sure, the fixed version runs at about 90% of the original's speed on the smaller list size (5,000) and at about the same speed on the larger ones (50,000 and 100,000).
Great, thanks!
Describe the bug
HashDiffer’s handling of duplicate rows only partially supports expected behavior. Specifically, there are two scenarios that need to be addressed:
While the first scenario is correctly handled, the second one is not.
Steps to reproduce - use the following values:
The expected output:
The actual output:
In the actual output, the difference for the (6, "ABCDE") row is not detected, even though the row exists once in list a but twice in list b.
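For reference, here is a minimal sketch of the multiplicity-aware comparison I would expect. It is not HashDiffer's actual implementation: the `multiset_diff` helper and the sample lists are hypothetical, and it assumes rows are hashable tuples.

```python
from collections import Counter


def multiset_diff(a, b):
    """Return (only_in_a, only_in_b), counting duplicate rows.

    Hypothetical helper for illustration; not part of HashDiffer.
    """
    count_a, count_b = Counter(a), Counter(b)
    # Counter subtraction keeps only positive counts, so a row that
    # appears once in a and twice in b shows up once in only_in_b.
    only_in_a = list((count_a - count_b).elements())
    only_in_b = list((count_b - count_a).elements())
    return only_in_a, only_in_b


# Hypothetical sample data: (6, "ABCDE") appears once in a, twice in b.
a = [(5, "ABCD"), (6, "ABCDE")]
b = [(5, "ABCD"), (6, "ABCDE"), (6, "ABCDE")]

only_in_a, only_in_b = multiset_diff(a, b)
print(only_in_a)
print(only_in_b)
```

With a set-based comparison the extra (6, "ABCDE") in b would be invisible. Comparing counts rather than membership is what surfaces it.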