own-pt / sensetion.el

Emacs word-sense annotation interface
GNU General Public License v3.0

diff between import vs export file from corpus #176

Closed: arademaker closed this issue 4 years ago

arademaker commented 4 years ago
sort ../data/mongo/glosstag.json > orig.json
sort glosstag.1.json > anot.json
diff orig.json anot.json

Why does the diff between the JSON imported into MongoDB and the exported JSON show a LOT of differences?

arademaker commented 4 years ago

The diff is also huge without the sort.

odanoburu commented 4 years ago

Sorting with sort doesn't work, since it only reorders lines, and we may also need to reorder the key-value pairs inside each JSON dictionary.

Try running the following on both files before diffing:

jq -s -S -c 'sort_by(._id)[]' glosstag.json > glosstag-ordered.json

-s: read all inputs into a single array
-S: sort the keys of each object
-c: compact (versus pretty-printed) output
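For anyone without jq at hand, the same normalization can be sketched in Python. This is only an illustration, not part of sensetion.el: the function name normalize_lines is made up here, and it assumes the file holds one JSON document per line, each with an _id field. Documents are ordered by the JSON text of their _id, which is deterministic but may not match jq's ordering, and number formatting may also differ from jq, so both files should go through the same tool.

```python
import json


def normalize_lines(lines):
    """Return one compact, key-sorted JSON string per document, ordered by _id.

    Hypothetical helper: assumes one JSON document per input line and that
    every document has an "_id" field.
    """
    docs = [json.loads(line) for line in lines if line.strip()]
    # Order documents by the serialized _id so that string, numeric, and
    # object-valued ids can all be compared without a TypeError.
    docs.sort(key=lambda d: json.dumps(d["_id"], sort_keys=True))
    # sort_keys=True reorders keys at every nesting level; the separators
    # drop the spaces to get compact, one-line output.
    return [
        json.dumps(d, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
        for d in docs
    ]
```

Writing the returned lines for both the imported and the exported file and then diffing the two results should leave only real differences, since neither line order nor key order can vary any more.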

I've followed your steps here and it shows the expected output ~minus a small bug in the alist<->JSON conversion: what used to be an array of two-element arrays was converted to a dictionary, which is isomorphic but causes spurious diffs. This doesn't affect functionality, since it only happens in one of the meta fields of the glosstag conversion. I'll fix this, but it's probably due to Common Lisp and Emacs Lisp treating the conversion differently.~ (this has been fixed)
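To illustrate why the array-of-pairs versus dictionary difference shows up in a diff even though no information is lost, here is a small Python sketch. The keys and values are placeholders, not the actual glosstag meta field, and pairs_to_dict is a hypothetical helper written for this example.

```python
import json


def pairs_to_dict(value):
    """Turn a non-empty list of [key, value] pairs into a dict.

    Hypothetical helper: anything that is not a list of two-element lists
    with string keys is returned unchanged.
    """
    is_pair_list = (
        isinstance(value, list)
        and len(value) > 0
        and all(
            isinstance(p, list) and len(p) == 2 and isinstance(p[0], str)
            for p in value
        )
    )
    return dict(value) if is_pair_list else value


# Placeholder data: the same two associations in both encodings.
as_pairs = json.loads('[["key1", "val1"], ["key2", "val2"]]')
as_dict = json.loads('{"key1": "val1", "key2": "val2"}')

# The serialized text differs, so a textual diff reports a change...
texts_differ = json.dumps(as_pairs, sort_keys=True) != json.dumps(as_dict, sort_keys=True)
# ...even though both encodings carry the same associations.
same_content = pairs_to_dict(as_pairs) == as_dict
```

Here both texts_differ and same_content come out true, which is the "isomorphic but spurious diff" situation described above.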