Closed rillian closed 5 years ago
Merging #80 into master will increase coverage by 1.01%. The diff coverage is 100%.
@@ Coverage Diff @@
## master #80 +/- ##
==========================================
+ Coverage 86.22% 87.23% +1.01%
==========================================
Files 27 27
Lines 987 995 +8
==========================================
+ Hits 851 868 +17
+ Misses 136 127 -9
Impacted Files | Coverage Δ | |
---|---|---|
pyoracc/atf/common/atffile.py | 80.85% <100%> (+3.35%) | :arrow_up: |
pyoracc/atf/common/atfyacc.py | 98.65% <0%> (+0.54%) | :arrow_up: |
pyoracc/atf/common/atflex.py | 100% <0%> (+2.37%) | :arrow_up: |
pyoracc/__init__.py | 75% <0%> (+12.5%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 39c612f...af363ab.
This can be merged after #77 and #79.
@rillian Can you please rebase this on the new master to keep things a bit cleaner? Thanks!
\o/
@rillian I might be late. But you can use this to serialize any Python 3 object:
https://gist.github.com/jayanthjaiswal/b722625f0cebda14cdfaaa7e8b74c3ae
Thanks @jayanthjaiswal, that works too! It's similar to what I did, I think, but handles more types and does its own recursion instead of hooking into the JSONEncoder's traversal. Useful for the next time it comes up!
I wrote this to see what kind of syntax tree the parser was producing, and it took me a while to understand how, since json.dump() from the standard library doesn't work on general Python objects. I thought it worth including for those reasons, and as a useful way to import tablet data into other tools.
It currently produces a "flat" serialization without object names, which I found most useful for exploring. If there's interest in adding parsing (being able to import json and serialize it back into ATF) that would probably need to change.
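For readers unfamiliar with the approach, here is a minimal sketch of hooking into the `JSONEncoder` traversal to get a "flat" serialization. It is not the code in this PR: the `Line`, `Tablet` and `FlatEncoder` classes and the sample values are hypothetical stand-ins, and it uses `json.dumps()` so the result is a string.

```python
import json


class Line:
    """Hypothetical stand-in for a parsed tree node (not a pyoracc class)."""

    def __init__(self, label, words):
        self.label = label
        self.words = words


class Tablet:
    """Hypothetical container node holding other nodes."""

    def __init__(self, code, lines):
        self.code = code
        self.lines = lines


class FlatEncoder(json.JSONEncoder):
    """Serialize arbitrary objects as their attribute dictionaries."""

    def default(self, obj):
        # Called only for objects the encoder cannot handle natively.
        # Returning the attribute dict lets the encoder continue its own
        # traversal, so nested objects end up back in this method.
        try:
            return vars(obj)
        except TypeError:
            # No __dict__ to fall back on: defer to the standard error.
            return super().default(obj)


# Made-up sample data, for illustration only.
tablet = Tablet("P000001", [Line("1", ["a", "b"]), Line("2", ["c"])])

# The standard library refuses general objects without a hook.
try:
    json.dumps(tablet)
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True

# With the hook, the output is "flat": attribute names only, no class names.
flat = json.dumps(tablet, cls=FlatEncoder)
```

Because class names are dropped, a serialization like this cannot be read back into the original object types without extra information, which is why importing JSON would need a different format.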
The `to_json` method is general, so it would be nice to be able to call it on any of the tree objects. The easiest way to do that would be to add it to `oraccobject` and then make all the objects in the `model` hierarchy inherit from that.

The `to_json` method passes optional arguments on to `json.dump()`, but there's a problem with `sort_keys`. The `Multilingual` objects store the unmarked language lines in a dictionary under `None`, which can't be sorted with respect to the other strings. I've just left this as an xfail in the tests, since it's not the default. If you want to address this I can suggest changing the parser to substitute the overall language code of the tablet, or the empty string, or copying the whole tree and making a similar substitution before serializing. The latter would be expensive on corpus objects.

NB Currently includes changes from #77 which I hope will be merged first, and from #79 to make the tests pass.
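To illustrate the `sort_keys` problem and the copy-and-substitute workaround described above, here is a rough sketch. The `JsonMixin` base class, the simplified `Multilingual` stand-in, the `replace_none_keys` helper and the `"akk"` sample key are all assumptions made for the example, not the code in this PR.

```python
import json


class JsonMixin:
    """Hypothetical shared base class providing to_json()."""

    def to_json(self, **kwargs):
        # Optional arguments are forwarded to json.dumps() unchanged.
        return json.dumps(self, default=vars, **kwargs)


class Multilingual(JsonMixin):
    """Simplified stand-in: lines keyed by language, None for unmarked."""

    def __init__(self):
        self.lines = {None: ["unmarked line"], "akk": ["marked line"]}


def replace_none_keys(value, substitute=""):
    """Return a copy of dicts and lists with None keys replaced.

    Only plain dicts and lists are walked; nested objects would need
    converting (for example with vars()) before calling this.
    """
    if isinstance(value, dict):
        return {
            (substitute if key is None else key):
                replace_none_keys(item, substitute)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [replace_none_keys(item, substitute) for item in value]
    return value


obj = Multilingual()

# Without sorting, the None key is written out as the string "null".
default_output = obj.to_json()

# With sort_keys=True, None cannot be ordered against the string keys.
try:
    obj.to_json(sort_keys=True)
    sort_failed = False
except TypeError:
    sort_failed = True

# Workaround: copy the tree, substituting the empty string for None.
safe_copy = replace_none_keys(vars(obj))
sorted_output = json.dumps(safe_copy, sort_keys=True)
```

The copy is what makes this costly on large corpus objects; substituting the key in the parser would avoid the extra pass.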