Desbordante / desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios with these algorithms. Desbordante has a console version and an easy-to-use web application.
GNU Affero General Public License v3.0

Implement OD verifier algorithm #420

Open vano105 opened 3 months ago

vano105 commented 3 months ago

Implement a novel algorithm for validating canonical ODs (order dependencies). The algorithm takes as input the left and right column indices, a context (a set of columns), and a flag that sets the dependency order (ascending/descending). It outputs the rows that violate the dependency through swaps or splits. Add unit tests for OD verification.

For more information about canonical ODs: http://www.vldb.org/pvldb/vol10/p721-szlichta.pdf
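
To make the expected behaviour concrete, here is a minimal Python sketch of what such a check could look like. It is not the proposed algorithm and not Desbordante's API: the function name `verify_od`, its parameters, and the interpretation of the `ascending` flag (True means the right column must not decrease as the left column increases, False means it must not increase) are assumptions made for illustration. Within each group of rows that agree on the context columns, a split is a pair with equal left values but different right values, and a swap is a pair whose right values are ordered against the left values. The pairwise comparison is quadratic per group and is chosen only for readability.

```python
from collections import defaultdict
from itertools import combinations


def verify_od(rows, context, lhs, rhs, ascending=True):
    """Hypothetical helper: return (split_rows, swap_rows) as sorted row indices.

    rows      -- sequence of tuples (the table)
    context   -- column indices that define the equivalence classes
    lhs, rhs  -- left and right column indices of the OD
    ascending -- assumed meaning: True if rhs must not decrease as lhs grows
    """
    # Partition row indices by their values on the context columns.
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[c] for c in context)].append(i)

    splits, swaps = set(), set()
    for idxs in groups.values():
        for a, b in combinations(idxs, 2):
            la, lb = rows[a][lhs], rows[b][lhs]
            ra, rb = rows[a][rhs], rows[b][rhs]
            if la == lb:
                # Split: same left value, different right values.
                if ra != rb:
                    splits.update((a, b))
                continue
            if la > lb:
                # Normalise so that (la, ra) belongs to the smaller left value.
                ra, rb = rb, ra
            # Swap: right values are ordered against the left values.
            violated = ra > rb if ascending else ra < rb
            if violated:
                swaps.update((a, b))
    return sorted(splits), sorted(swaps)


# Columns: 0 = context, 1 = left, 2 = right
table = [
    (1, 10, 100),
    (1, 20, 200),
    (1, 20, 250),
    (1, 30, 150),
    (2, 5, 9),
    (2, 6, 8),
]
split_rows, swap_rows = verify_od(table, context=[0], lhs=1, rhs=2)
```

In this toy table, rows 1 and 2 form a split (same left value 20, different right values), while row 3 swaps with rows 1 and 2, and rows 4 and 5 swap with each other. A production implementation would more likely sort each context group once instead of comparing all pairs.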