own-pt / delphin-rdf

RDF specifications for DELPH-IN semantic representations and a Pydelphin plugin for RDF generation.
MIT License

DMRS representation #21

Open arademaker opened 3 years ago

arademaker commented 3 years ago

Considering the graph

image

We want it to be as close as possible to its graphical representation

image
arademaker commented 3 years ago
  1. We need an rdfs:label on each node, holding its predText value.
  2. We need an rdfs:label on each link, holding its edge label, such as ARG1/EQ.
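
A minimal sketch of the two labels above, written as plain N-Triples lines so that it needs no RDF library. The helper names `label_triple` and `link_label` are hypothetical, the node and link URIs are only illustrative, and the assumption that this particular link carries ARG1/EQ is made up for the example.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def label_triple(subject_uri: str, label: str) -> str:
    """Return one N-Triples line attaching an rdfs:label to a resource."""
    # Escape backslashes and double quotes so the literal stays valid.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject_uri}> <{RDFS_LABEL}> "{escaped}" .'


def link_label(rargname: str, post: str) -> str:
    """Build the edge label shown in the graphical form, e.g. ARG1/EQ."""
    return f"{rargname}/{post}"


# Illustrative URIs only; the URI scheme itself is still under discussion.
node_line = label_triple("http://ibm.com/sick/b/33/4/nodes/10003", "_look_v_1")
link_line = label_triple("http://ibm.com/sick/b/33/4/links/8", link_label("ARG1", "EQ"))

print(node_line)
print(link_line)
```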
arademaker commented 3 years ago

Eventually, we would like to be as close as possible to the XML serialization. See https://github.com/delph-in/pydelphin/issues/329.

<dmrs cfrom="-1" cto="-1" top="10003" index="10003">
  <node nodeid="10000" cfrom="0" cto="13">
    <gpred>udef_q</gpred>
    <sortinfo />
  </node>
  <node nodeid="10001" cfrom="0" cto="6">
    <realpred lemma="mask" pos="v" sense="1" />
    <sortinfo SF="prop" TENSE="untensed" MOOD="indicative" PROG="bool" PERF="-" cvarsort="e" />
  </node>
  <node nodeid="10002" cfrom="7" cto="13">
    <realpred lemma="people" pos="n" sense="of" />
    <sortinfo PERS="3" NUM="pl" IND="+" cvarsort="x" />
  </node>
  <node nodeid="10003" cfrom="18" cto="25">
    <realpred lemma="look" pos="v" sense="1" />
    <sortinfo SF="prop" TENSE="pres" MOOD="indicative" PROG="+" PERF="-" cvarsort="e" />
  </node>
...  
  <node nodeid="10010" cfrom="51" cto="52">
    <realpred lemma="a" pos="q" />
    <sortinfo />
  </node>
  <node nodeid="10011" cfrom="53" cto="59">
    <realpred lemma="forest" pos="n" sense="of" />
    <sortinfo PERS="3" NUM="sg" IND="+" cvarsort="x" />
  </node>
  <link from="10000" to="10002">
    <rargname>RSTR</rargname>
    <post>H</post>
  </link>
  <link from="10001" to="10002"><rargname>ARG2</rargname><post>EQ</post></link>
  <link from="10003" to="10002"><rargname>ARG1</rargname><post>NEQ</post></link>
  <link from="10004" to="10003"><rargname>ARG1</rargname><post>EQ</post></link>
  <link from="10004" to="10008"><rargname>ARG2</rargname><post>NEQ</post></link>
  <link from="10005" to="10008"><rargname>RSTR</rargname><post>H</post></link>
  <link from="10006" to="10008"><rargname>ARG1</rargname><post>EQ</post></link>
  <link from="10007" to="10006"><rargname>ARG1</rargname><post>EQ</post></link>
  <link from="10009" to="10003"><rargname>ARG1</rargname><post>EQ</post></link>
  <link from="10009" to="10011"><rargname>ARG2</rargname><post>NEQ</post></link>
  <link from="10010" to="10011"><rargname>RSTR</rargname><post>H</post></link>
</dmrs>
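
To see which information the XML carries per node and per link, here is a rough sketch that reads a shortened copy of the serialization above with the standard library only. It is not the PyDelphin codec. The `pred_text` helper is hypothetical, and rebuilding the predicate string as `_lemma_pos_sense` is an assumption based on the usual DELPH-IN naming.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the serialization above (two nodes, one link).
DMRS_XML = """<dmrs cfrom="-1" cto="-1" top="10003" index="10003">
  <node nodeid="10000" cfrom="0" cto="13">
    <gpred>udef_q</gpred>
    <sortinfo />
  </node>
  <node nodeid="10002" cfrom="7" cto="13">
    <realpred lemma="people" pos="n" sense="of" />
    <sortinfo PERS="3" NUM="pl" IND="+" cvarsort="x" />
  </node>
  <link from="10000" to="10002">
    <rargname>RSTR</rargname>
    <post>H</post>
  </link>
</dmrs>"""


def pred_text(node):
    """Return the predicate string of a <node> element (assumed convention)."""
    gpred = node.find("gpred")
    if gpred is not None:
        return gpred.text
    real = node.find("realpred")
    parts = [real.get("lemma"), real.get("pos"), real.get("sense")]
    # The sense attribute is optional, so skip missing parts.
    return "_" + "_".join(p for p in parts if p)


root = ET.fromstring(DMRS_XML)
nodes = {n.get("nodeid"): pred_text(n) for n in root.findall("node")}
links = [
    (l.get("from"), l.get("to"), f"{l.findtext('rargname')}/{l.findtext('post')}")
    for l in root.findall("link")
]

print(nodes)
print(links)
```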
arademaker commented 3 years ago

After the hash (#), we should not use / separators. Note that the hash identifies a specific part of something. For instance, URLs use it to point to sections of pages: https://github.com/delph-in/docs/wiki/AceInstall#building-ace means the page AceInstall, located at the path /delph-in/docs/wiki/ on the server https://github.com, with a section building-ace. So in http://ibm.com/sick/b/33/4/nodes/10014#predicate we are talking about the predicate part/section of 10014, an item in the collection (path, folder) sick/b/33/4/nodes. This is how we normally think, right?
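
The same reading of server, path and fragment can be checked with Python's standard `urllib.parse`. This is only an illustration of the reasoning; the `describe` helper is a hypothetical name.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Split a URI into the server, the path and the fragment after '#'."""
    parts = urlsplit(uri)
    return {
        "server": f"{parts.scheme}://{parts.netloc}",
        "path": parts.path,
        "fragment": parts.fragment,
    }


wiki = describe("https://github.com/delph-in/docs/wiki/AceInstall#building-ace")
node = describe("http://ibm.com/sick/b/33/4/nodes/10014#predicate")

print(wiki)
print(node)
```

Since the fragment is whatever follows the first #, a / placed after it does not create a new path level.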

URIs, some initial thoughts:

One idea is to make the namespace of a profile as flat as possible. Note that in the current scheme, items 2 and 3 are not consistent with items 1 and 4.

  1. from http://ibm.com/sick/b/33/4/dmrsi#dmrs to http://ibm.com/sick/b/result-33-4
  2. from http://ibm.com/sick/b/33/4/nodes/10012 to http://ibm.com/sick/b/node-33-4-10012
  3. from http://ibm.com/sick/b/33/4/links/8 to http://ibm.com/sick/b/link-33-4-8
  4. from http://ibm.com/sick/b/33/4/nodes/10014#predicate to http://ibm.com/sick/b/predicate-33-4-10014

We need a new node http://ibm.com/sick/b/item-33 (or sentence-XX), with text as a property from that object to the sentence string. All results should be connected to that sentence by hasResult (I am trying to follow the terminology from the profiles).

One alternative would be to think of a hierarchical structure of collections. The profile is sick/b, which in turn has another collection 33 with an item 4 that has its parts:

  1. from http://ibm.com/sick/b/33/4/dmrsi#dmrs to http://ibm.com/sick/b/33/4
  2. from http://ibm.com/sick/b/33/4/nodes/10012 to http://ibm.com/sick/b/33/4#node-10012
  3. from http://ibm.com/sick/b/33/4/links/8 to http://ibm.com/sick/b/33/4#link-8
  4. from http://ibm.com/sick/b/33/4/nodes/10014#predicate to http://ibm.com/sick/b/33/4#predicate-10014

Another alternative would be to push the hierarchical structure as far as possible. We would need to mix the identifiers with names (like function names and arguments). The b profile has an item 33 that has a result 4 ...

  1. from http://ibm.com/sick/b/33/4/dmrsi#dmrs to http://ibm.com/sick/b/item/33/res/4
  2. from http://ibm.com/sick/b/33/4/nodes/10012 to http://ibm.com/sick/b/item/33/res/4/node/10012
  3. from http://ibm.com/sick/b/33/4/links/8 to http://ibm.com/sick/b/item/33/res/4/link/8
  4. from http://ibm.com/sick/b/33/4/nodes/10014#predicate to http://ibm.com/sick/b/item/33/res/4/node/10014/predicate
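
To compare the three proposals side by side, here is a small sketch with one function per scheme. The function names `flat`, `fragment` and `hierarchical` are hypothetical and exist only for this comparison. The prefix is the one used in the examples above.

```python
PREFIX = "http://ibm.com/sick/b"


def flat(kind, item, result, element=None):
    """Flat namespace: every resource sits directly under the profile."""
    ids = [str(item), str(result)]
    if element is not None:
        ids.append(str(element))
    return f"{PREFIX}/{kind}-" + "-".join(ids)


def fragment(item, result, kind=None, element=None):
    """Collections in the path; parts of a result after the hash."""
    base = f"{PREFIX}/{item}/{result}"
    return base if kind is None else f"{base}#{kind}-{element}"


def hierarchical(item, result, kind=None, element=None, part=None):
    """Fully hierarchical path that mixes names and identifiers."""
    uri = f"{PREFIX}/item/{item}/res/{result}"
    if kind is not None:
        uri += f"/{kind}/{element}"
    if part is not None:
        uri += f"/{part}"
    return uri


for scheme in (flat("node", 33, 4, 10012),
               fragment(33, 4, "node", 10012),
               hierarchical(33, 4, "node", 10012)):
    print(scheme)
```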
arademaker commented 3 years ago
image

Using the rdfs:label! I am not showing the dmrs node that links to all the nodes and links in the graph above.

arademaker commented 3 years ago

If http://ibm.com/sick/a/1/0#dmrs is the #dmrs section of document 0 in collection 1, which is part of collection a, then http://ibm.com/sick/a/1/0#link-10 is a section of the same document 0. But #link-10 is actually part of #dmrs, right?

yfaria commented 2 years ago

In fact, this version of the code has a flaw in URI construction: it does not take into account which semantic representation a node belongs to. Because the element names differ among the three semantic representations being converted (EDS, MRS, DMRS), this does not cause problems, but it is better to have this indication. The URI construction in #26 takes those parts into account. There are now three other URIs as well: the profile, the profile item, and the result of the item. Taking http://ibm.com/sick/b as the prefix given by the user, the profile URI would be the prefix itself, http://ibm.com/sick/b; the item with id 33 would have the URI http://ibm.com/sick/b/33; the fourth result would be http://ibm.com/sick/b/33/4; the DMRS of that result would be http://ibm.com/sick/b/33/4/dmrs; and, finally, an element of that DMRS would be preceded by a hash, giving http://ibm.com/sick/b/33/4/dmrs#link-10.
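
The scheme described in this comment can be sketched as a chain of small functions. This is not the code from #26; all function names here are hypothetical and only mirror the description.

```python
def profile_uri(prefix):
    """The profile URI is the user-given prefix itself."""
    return prefix.rstrip("/")


def item_uri(prefix, item):
    return f"{profile_uri(prefix)}/{item}"


def result_uri(prefix, item, result):
    return f"{item_uri(prefix, item)}/{result}"


def representation_uri(prefix, item, result, rep):
    """rep names the semantic representation: 'dmrs', 'mrs' or 'eds'."""
    return f"{result_uri(prefix, item, result)}/{rep}"


def element_uri(prefix, item, result, rep, kind, element):
    """Elements of a representation are addressed after the hash."""
    return f"{representation_uri(prefix, item, result, rep)}#{kind}-{element}"


print(element_uri("http://ibm.com/sick/b", 33, 4, "dmrs", "link", 10))
```

Putting the representation name in the path keeps elements of different representations of the same result apart, even if their local names ever coincide.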

arademaker commented 2 years ago

yes, it seems reasonable.

arademaker commented 2 years ago

@yfaria is this issue closed? I am not sure.

To be concrete, I believe we may have applications that need the simplest possible RDF, that is, the RDF closest to the graphical representation of DMRS.

We could potentially provide that function in this library, right?

Another question is how close we are to the XML representation of DMRSs, and whether we should try to fix any deviation that we may have made from it.