delph-in / pydelphin

Python libraries for DELPH-IN
https://pydelphin.readthedocs.io/
MIT License
79 stars 27 forks source link

dmrx codec does not suppress properties when properties=False #306

Closed arademaker closed 4 years ago

arademaker commented 4 years ago

The properties parameter is being ignored, am I right?

>> print(dmrx.encode(d, properties = False, indent = True))
<dmrs cfrom="-1" cto="-1" top="10002" index="10002">
<node nodeid="10000" cfrom="0" cto="6"><gpred>proper_q</gpred><sortinfo /></node>
<node nodeid="10001" cfrom="0" cto="6" carg="Abrams"><gpred>named</gpred><sortinfo PERS="3" NUM="sg" IND="+" cvarsort="x" /></node>
<node nodeid="10002" cfrom="7" cto="15"><realpred lemma="promise" pos="v" sense="1" /><sortinfo SF="prop" TENSE="past" MOOD="indicative" PROG="-" PERF="-" cvarsort="e" /></node>
<node nodeid="10003" cfrom="16" cto="19"><realpred lemma="the" pos="q" /><sortinfo /></node>
<node nodeid="10004" cfrom="20" cto="23"><realpred lemma="dog" pos="n" sense="1" /><sortinfo PERS="3" NUM="sg" IND="+" PT="pt" cvarsort="x" /></node>
<node nodeid="10005" cfrom="27" cto="31"><realpred lemma="bark" pos="v" sense="1" /><sortinfo SF="prop-or-ques" TENSE="untensed" MOOD="indicative" PROG="-" PERF="-" cvarsort="e" /></node>
<link from="10000" to="10001"><rargname>RSTR</rargname><post>H</post></link>
<link from="10002" to="10001"><rargname>ARG1</rargname><post>NEQ</post></link>
<link from="10002" to="10004"><rargname>ARG2</rargname><post>NEQ</post></link>
<link from="10002" to="10005"><rargname>ARG3</rargname><post>H</post></link>
<link from="10003" to="10004"><rargname>RSTR</rargname><post>H</post></link>
<link from="10005" to="10001"><rargname>ARG1</rargname><post>NEQ</post></link>
</dmrs>
goodmami commented 4 years ago

You're correct. The API documentation says it should work, so this is a bug. The DTD requires the <sortinfo> element to be there to be valid, but the feature attributes are optional, so it could still be DTD compliant by positing an empty <sortinfo /> element instead. Then again, we're already "ahead" of the DTD by using the top and index attributes on <dmrs>, so maybe it's the DTD that needs an update.

Either way the fix is trivial. Care to submit a PR? Look in delphin/codecs/dmrx.py for the node-encoding function.

arademaker commented 4 years ago

Sure, I will make a PR.