paulfitz / daff

align and compare tables
https://paulfitz.github.io/daff
MIT License
790 stars 68 forks source link

Add updated columns summary #167

Open hasnat opened 3 years ago

hasnat commented 3 years ago

To possibly have more extensive report on summary for changes per column. TODO: update map to <String,Int> to capture column names instead of index

note: not sure why tests are failing.

note2: possibly this MR is not the way, and some other way is preferred.

paulfitz commented 3 years ago

Thanks @hasnat. The test failure seems to be mostly due to a haxe change, where the return type of the main method is constrained to be void. I expect that could be easy to fix, just by calling Sys.exit if the program exit code should be non-zero.

Is there a test case you can think of to show the expected behavior of this change? You can find some existing checks on getSummary in harness/BasicTest.hx. These tests can be run with make ntest_js (replace js with your preferred language).

hasnat commented 3 years ago

thanks @paulfitz for the directions on test cases. I can try adding a couple along with haxe main change.

my aim is to have summary include following details (satisfies my usecase) {column_units_updates: { columnA:1, columnB:2.... } }

not sure if this should be complete with inserts/deletes info, and which way to nest if we're completing set ? {column_units_updates: { columnA:1, columnB:2.... } } {column_units_inserts: { columnA:1, columnB:1.... } } {column_units_deletes: { columnB:1.... } } OR {column_units_updates: { columnA:{inserts:1, updates: 1}, columnB:{inserts:1, deletes:1, updates:2}.... } }

hasnat commented 3 years ago

hi @paulfitz slight updates to this PR this is going way beyond the diff specs here https://paulfitz.github.io/daff-doc/spec.html

initial aim of mine was to capture diff/diff types counts against columns e.g. given two tables here https://github.com/hasnat/daff/blob/patch-1/test/test_column_diff_summary2.js#L5

Ideal diff am after will visually provide data as such. (possibly meta info in squeezed in cols schema... see meta below)

Screenshot 2021-03-07 at 17 57 44

code in this PR is far from optimal and not intendented for merge yet. However I'd like to discuss further if this is something way off daff or we can steer this draft to right path?

note: this branch as now will not show full row additions/removals properly yet. they are expected to stay as they are in initial specs fixed

meta (column_units_updates) from start of draft now looks like this

"column_units_updates": {
    "h": {
      "Color": {
        "h": {
          "=": 9,
          "+": 1
        }
      },
      "Name": {
        "h": {
          "+": 1,
          "=": 6,
          "-": 2,
          "!=": 1
        }
      },
      "Number": {
        "h": {
          "=": 10
        }
      }
    }
  },