delph-in / docs

DELPH-IN Documentation
https://delph-in.github.io/docs/

documentation about semantic representations #15

Open arademaker opened 3 years ago

arademaker commented 3 years ago

One missing feature of the GitHub wiki is that it does not send emails notifying changes in the wiki, right? It would be nice to let people know when we make modifications.

I just added https://github.com/delph-in/docs/wiki/RmrsDmrs from the information I got from https://github.com/delph-in/pydelphin/issues/329#issuecomment-872125764.

I believe we need, for all DELPH-IN semantic representations, a uniform documentation about:

  1. the semantics
  2. the abstract syntax
  3. the concrete syntax (XML, RDF, JSON etc)

More? Why is the page named RmrsDmrs? Can we rename it to just Dmrs? Why use the Rmrs prefix?

arademaker commented 3 years ago

Probably for historical reasons, the DTDs (XML representation schemas) for MRS, DMRS, etc. are in http://svn.emmtee.net/trunk/, mixed with the LKB source code. I would like to propose moving those DTDs to this repository under a folder called schemas.

goodmami commented 3 years ago

> One missing feature of the GitHub wiki is that it does not send emails notifying changes in the wiki, right?

I'm quite happy not to receive emails for wiki edits and to rely on the "All activity" notices at github.com. Unfortunately for those who wish to receive more email, it's not exactly possible to configure GitHub to send emails for wiki edits. You can, in theory, get an Atom feed of the wiki (source), but it timed out when I tried it. Maybe our wiki is too big for that functionality.

> Why is the page named RmrsDmrs? Can we rename it to just Dmrs? Why use the Rmrs prefix?

It's just historical. I believe there was a time when all MRS-related wikis were created under the "Rmrs" namespace.

> I would like to propose moving those DTDs to this repository under a folder called schemas.

I think that's a great idea. I also created RelaxNG versions of the schemas before, and they were a bit less ERG-centric, too.

arademaker commented 3 years ago

In 1981f0e, I created the folder schemas and copied into it the DTD files I found in lkb/src.

arademaker commented 3 years ago

Hi @goodmami, do you still have your RelaxNG schemas? I found something for MRX in https://github.com/delph-in/docs/wiki/MrsRFC.

goodmami commented 3 years ago

The MRX one is as on the wiki. I have added it and the DMRX one to this repo. I could only find a partially-finished dmrs.rnc file, so I just now finished it up. The notes below regard dmrs.rnc.

There are two main differences from the DTD:

You might find that this does not validate DMRSs from PyDelphin nor from the LKB (I tested LKB-FOS) for several reasons:

  1. Missing <dmrs-list> top-element (both LKB and PyDelphin; depends on how encoding functions are called)
  2. Property names are printed in upper-case (PyDelphin, see delph-in/pydelphin#333)
  3. Underspecified property values are not u, but bool/pers/etc. (both LKB and PyDelphin)
  4. Specified boolean values are + and -, not plus and minus (PyDelphin)
  5. prontype property name is spelled pt (both LKB and PyDelphin)

(4) is funny: the DTD only has plus and minus because DTDs cannot specify + as an attribute value, so it appears to be just a hackish workaround. It seems the LKB anticipates this and outputs plus and minus, but PyDelphin does not. (3) and (5) are mismatches between the grammar definitions and the DTD.
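For mismatches (1), (2), and (4) in the list above, one option is to normalize the XML before validating it. The following is only a minimal sketch using Python's standard library. The helper name normalize_dmrx is hypothetical, and the code assumes that properties are serialized as attributes of <sortinfo> elements and that the single-document root element is <dmrs>; check both against your actual output.

```python
import xml.etree.ElementTree as ET

# Assumption: + and - are the only value spellings that need mapping.
BOOL_SPELLINGS = {"+": "plus", "-": "minus"}


def normalize_dmrx(xml_text):
    """Hypothetical helper: adjust DMRX text toward what dmrs.rnc expects."""
    root = ET.fromstring(xml_text)

    # Mismatch (1): wrap a bare <dmrs> in a <dmrs-list> root element.
    if root.tag == "dmrs":
        wrapper = ET.Element("dmrs-list")
        wrapper.append(root)
        root = wrapper

    # Assumption: properties live as attributes on <sortinfo> elements.
    for sortinfo in root.iter("sortinfo"):
        fixed = {}
        for name, value in sortinfo.attrib.items():
            # Mismatch (2): lower-case the property name.
            # Mismatch (4): spell + and - as plus and minus.
            fixed[name.lower()] = BOOL_SPELLINGS.get(value, value)
        sortinfo.attrib.clear()
        sortinfo.attrib.update(fixed)

    return ET.tostring(root, encoding="unicode")
```

Mismatches (3) and (5) concern grammar-specific names and values, so this sketch leaves them for the schema side.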

I therefore made the schema easy to customize for a grammar. One could either edit dmrs.rnc directly or create a new RelaxNG file that includes dmrs.rnc and replaces some of its definitions. Here's an example of the latter:

# File: dmrs-erg-2020.rnc
# Note: assumes dmrs.rnc is in the same directory

include "dmrs.rnc" {

  # Allow either <dmrs> or <dmrs-list> as root
  start = Dmrs | DmrsList

  # Redefine property attributes for ERG-2020
  Properties = attribute num { "sg"|"pl"|"number" }?,
               attribute pers { "1"|"2"|"3"|"pers" }?,
               attribute gend { "m"|"f"|"n"|"m-or-f"|"gender" }?,
               attribute sf { "prop"|"ques"|"comm"|"prop-or-ques"|"sf" }?,
               attribute tense { "past"|"pres"|"fut"|"tensed"|"untensed"|"tense" }?,
               attribute mood { "indicative"|"subjunctive"|"mood" }?,
               attribute pt { "std"|"zero"|"refl"|"notpro"|"pt" }?,
               # Allow all of plus, minus, +, and - to accommodate both the LKB and PyDelphin
               attribute prog { "plus"|"minus"|"+"|"-"|"bool" }?,
               attribute perf { "plus"|"minus"|"+"|"-"|"bool" }?,
               attribute ind { "plus"|"minus"|"+"|"-"|"bool" }?

}

You can then use it with Jing as follows:

$ jing -c dmrs-erg-2020.rnc dmrs.xml
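If the validation needs to run from a script, the same command can be wrapped in Python. This is a sketch under assumptions: the function names jing_command and validate are hypothetical, jing is expected to be on PATH, and a non-zero exit status is taken to mean that validation failed.

```python
import shutil
import subprocess


def jing_command(schema, xml_files):
    """Build the argument list for the jing invocation shown above."""
    # -c marks the schema as RELAX NG compact syntax (.rnc).
    return ["jing", "-c", schema, *xml_files]


def validate(schema, xml_files):
    """Return True or False for validity, or None if jing is not installed."""
    if shutil.which("jing") is None:
        return None
    # Jing's own messages are passed through to the terminal unchanged.
    result = subprocess.run(jing_command(schema, xml_files))
    # Assumption: a non-zero exit status means the documents did not validate.
    return result.returncode == 0
```

Calling validate("dmrs-erg-2020.rnc", ["dmrs.xml"]) would then correspond to the command line above.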