paulfitz / daff

align and compare tables
https://paulfitz.github.io/daff
MIT License

TypeError when using daff on JSON #139

Status: Closed (wessport closed this issue 5 years ago)

wessport commented 5 years ago

Hello!

I'm trying to compare two json files, but I keep encountering the following error:

$daff version 1.3.40 
$daff diff --output test_output.csv test1_output.json test2_output.json
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/daff.py", line 2735, in loadTable
    json = python_lib_Json.loads(txt,None,None,python_Lib.dictToAnon)
TypeError: loads() takes 1 positional argument but 4 were given

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/daff.py", line 3467, in run
    a = self.loadTable(aname,"local")
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/daff.py", line 2745, in loadTable
    raise _HxException(e)

daff._HxException: TypeError('loads() takes 1 positional argument but 4 were given',)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/daff", line 11, in <module>
    load_entry_point('daff==1.3.40', 'console_scripts', 'daff')()
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/daff.py", line 11870, in main
    Coopy.main()
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/daff.py", line 3649, in main
    ret = coopy.coopyhx(io)
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/daff.py", line 3535, in coopyhx
    return self.run(args,io)
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/daff.py", line 3522, in run
    raise _HxException(e1)
daff._HxException: TypeError('loads() takes 1 positional argument but 4 were given',)

I've put together two example json files in the same format as the files I'm working with:

test1.json

{"t1":[{"branch":"branch1","value":{"name":"Jane","number":14}},{"branch":"branch2","value":{"name":"Jane","number":14}}]}
{"t2":[{"branch":"branch1","value":{"name":"John","number":55}},{"branch":"branch2","value":{"name":"John","number":88}}]}

test2.json

{"t1":[{"branch":"branch1","value":{"name":"Wes","number":12}},{"branch":"branch2","value":{"name":"Jane","number":14}}]}
{"t2":[{"branch":"branch1","value":{"name":"John","number":55}},{"branch":"branch2","value":{"name":"John","number":88}}]}

I've tried several different JSON layouts, and they all result in the same error.

Beginner here so there's a good chance I'm doing something wrong on my end. I had success by flattening and converting the json to csv files, but it would be nice if I could just work with the raw json files.

Any advice on getting around this error would be much appreciated! Awesome library by the way!

paulfitz commented 5 years ago

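One format that works with daff is ndjson (newline-delimited JSON, one object per line) with unnested hashes, i.e. flat objects whose values are not themselves objects or arrays. For example, comparing: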

thing1.ndjson

{"t": "t1", "branch": "branch1", "name": "Jane", "number": "14"}
{"t": "t1", "branch": "branch2", "name": "Jane", "number": "14"}
{"t": "t2", "branch": "branch1", "name": "John", "number": "55"}
{"t": "t2", "branch": "branch2", "name": "John", "number": "88"}

thing2.ndjson

{"t": "t1", "branch": "branch1", "name": "Wes", "number": "12"}
{"t": "t1", "branch": "branch2", "name": "Jane", "number": "14"}
{"t": "t2", "branch": "branch1", "name": "John", "number": "55"}
{"t": "t2", "branch": "branch2", "name": "John", "number": "88"}

Gives:

@@,name    ,number,branch ,t
→ ,Jane→Wes,14→12 ,branch1,t1

But daff doesn't have logic for comparing arbitrary JSON.
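
For reference, the TypeError in the traceback above looks consistent with a change in Python 3.6, where the optional parameters of json.loads became keyword-only, so a call that passes four positional arguments is rejected. The sketch below reproduces that call shape with the standard library only. The `hook` function is a hypothetical stand-in for daff's `python_Lib.dictToAnon`, and this is an illustration of the likely cause, not a patch for daff:

```python
import json

txt = '{"name": "Jane", "number": 14}'


def hook(d):
    # Hypothetical stand-in for daff's python_Lib.dictToAnon.
    return d


try:
    # Same call shape as the failing line in the traceback:
    # one text argument plus three more positional arguments.
    json.loads(txt, None, None, hook)
    positional_ok = True
except TypeError as exc:
    # Expected on Python 3.6 and later, where the optional
    # parameters of json.loads are keyword-only.
    positional_ok = False
    print("positional call failed:", exc)

# Passing the hook by keyword is accepted.
record = json.loads(txt, object_hook=hook)
print(record)
```

Even with the call fixed, the nested test files would presumably still not compare cleanly, since daff expects flat rows as described above.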

wessport commented 5 years ago

@paulfitz thank you for the explanation! I'll continue with the flattening approach then. Thanks!
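
A minimal sketch of the flattening approach, assuming every line of the input has the same shape as the test1.json example in the original post (one object per line, mapping a name such as "t1" to a list of entries with "branch" and "value" keys). The function names `flatten_line` and `to_ndjson` are made up for illustration. Numbers are kept as JSON numbers here, whereas the ndjson example above quotes them as strings:

```python
import json


def flatten_line(line):
    """Turn one {"t1": [{"branch": ..., "value": {...}}, ...]} object into flat rows."""
    rows = []
    for table_name, entries in json.loads(line).items():
        for entry in entries:
            row = {"t": table_name, "branch": entry["branch"]}
            # Lift the nested "value" fields (name, number) to the top level.
            row.update(entry["value"])
            rows.append(row)
    return rows


def to_ndjson(text):
    """Flatten every non-empty line and emit one flat JSON object per line."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            rows.extend(flatten_line(line))
    return "\n".join(json.dumps(row) for row in rows)


# The two lines of test1.json from the original post.
sample = "\n".join([
    '{"t1":[{"branch":"branch1","value":{"name":"Jane","number":14}},'
    '{"branch":"branch2","value":{"name":"Jane","number":14}}]}',
    '{"t2":[{"branch":"branch1","value":{"name":"John","number":55}},'
    '{"branch":"branch2","value":{"name":"John","number":88}}]}',
])

print(to_ndjson(sample))
```

Writing the output for each input to its own .ndjson file gives two flat files that can be compared in the same way as thing1.ndjson and thing2.ndjson above.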