delph-in / erg

English Resource Grammar
MIT License

Possible SEM-I namespace collision: person #14

Closed goodmami closed 5 years ago

goodmami commented 5 years ago

In the SEM-I, person is the supertype of the values of the property PERS (in etc/erg.smi):

  person.
  1 < person.
  2 < person.
  3 < person.

It is also a (normalized) predicate name (in etc/abstract.smi):

  person : ARG0 x { NUM sg }.
  person : ARG0 i.

These are not collisions in the grammar's type hierarchy for two reasons:

  1. The person and number are combined into the pn type and only separated in the VPM
  2. The person predicate is person_rel in the grammar

(so in fact the grammar does not have person in its hierarchy, while the SEM-I has it twice)

I have tagged this issue as a question because I am not certain that the whole SEM-I is meant to fit into a single hierarchy or if we interpret it as separate hierarchies for variables, properties, and predicates.
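To make the two readings concrete, here is a minimal sketch of how the same entries either collide or do not, depending on whether names share one hierarchy. The `find_collisions` function and the `(namespace, name)` tuples are hypothetical and only model the entries quoted above; this is not how any SEM-I software actually loads the files.

```python
# Hypothetical, simplified model of the SEM-I entries quoted above.
# Each entry is (namespace, type name).
entries = [
    ("property", "person"),   # person.
    ("property", "1"),        # 1 < person.
    ("property", "2"),        # 2 < person.
    ("property", "3"),        # 3 < person.
    ("predicate", "person"),  # person : ARG0 x { NUM sg }.
]


def find_collisions(entries, unified):
    """Return names defined in more than one namespace.

    With unified=True all names share a single hierarchy; with
    unified=False each namespace is its own hierarchy, so the same
    name may appear once per namespace without clashing.
    """
    seen = {}
    clashes = set()
    for namespace, name in entries:
        key = name if unified else (namespace, name)
        if key in seen and seen[key] != namespace:
            clashes.add(name)
        seen.setdefault(key, namespace)
    return sorted(clashes)


print(find_collisions(entries, unified=True))   # ['person']
print(find_collisions(entries, unified=False))  # []
```

Under the single-hierarchy reading only person is reported; under the separate-hierarchies reading nothing is.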

If we go with a single hierarchy, this shouldn't be too hard to fix. Since the supertype value person, if it appeared as the value of PERS on an MRS, would be dropped after going through the VPM, we don't actually see it in external MRSs. Therefore we can rename it without repercussions, I think. For example, we could make the following changes to etc/erg.smi:

-  x < i & p : PERS person, NUM number, GEND gender, IND bool, PT pt.
+  x < i & p : PERS pers, NUM number, GEND gender, IND bool, PT pt.

and

-  person.
-  1 < person.
-  2 < person.
-  3 < person.
+  pers.
+  1 < pers.
+  2 < pers.
+  3 < pers.

Nothing needs to change for semi.vpm since the supertype is not referenced, only the feature name and value subtypes.
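As a rough illustration of the rename, the sketch below applies it to just the erg.smi lines shown in the diffs above. The `rename_type` helper is hypothetical, and the snippet is not the full file. A whole-word replace like this should only be run on erg.smi, since the same pattern would also rename the person predicate in etc/abstract.smi.

```python
import re

# Only the erg.smi lines quoted in this issue, not the full file.
ERG_SMI_SNIPPET = """\
  x < i & p : PERS person, NUM number, GEND gender, IND bool, PT pt.
  person.
  1 < person.
  2 < person.
  3 < person.
"""


def rename_type(text, old, new):
    """Replace whole-word, case-sensitive occurrences of a type name."""
    return re.sub(rf"\b{re.escape(old)}\b", new, text)


renamed = rename_type(ERG_SMI_SNIPPET, "person", "pers")
print(renamed)
```

The feature name PERS is left alone because the match is case-sensitive, and the subtypes 1, 2, and 3 keep their names, which is why semi.vpm would not need to change.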

fcbond commented 5 years ago

I think the person predicate name should be person_n (as it is only used for nouns).

-- Francis Bond http://www3.ntu.edu.sg/home/fcbond/ Division of Linguistics and Multilingual Studies Nanyang Technological University

goodmami commented 5 years ago

@fcbond that is certainly possible, and we have discussed putting POSs on abstract predicates (although we continue to insist that abstract predicates are not decomposable). I did not propose that change because the effects are much more drastic. My proposal does not change anything but the SEM-I, but changing the predicate name would render the treebanks that use person obsolete.

goodmami commented 5 years ago

Also, @oepen has responded to an email about this suggesting that these are in fact separate hierarchies so there would be no collision. I believe he is the authority on the matter of the SEM-I, but I'll wait and see if Dan has anything to add regarding the ERG before closing the issue.

danflick commented 5 years ago

Even though the separate hierarchies in the SEM-I technically avoid this name collision, as Stephan notes, I don't mind adjusting the naming to keep the predicate name distinct from the variable property name. While I agree with Francis that it will be more consistent to change person_rel to person_n_rel, I want to do this more systematically for our full inventory of abstract predicates, so that may take a little longer. But I will also follow Mike's suggestion and change 'person' to 'pers' in erg.smi, to be checked in to trunk shortly.

goodmami commented 5 years ago

PyDelphin has begun the change to have 3 separate hierarchies instead of a unified one, so this is no longer technically an issue for me.

Changing 'person' to 'person_n' and doing so consistently with other abstract predicates sounds good (I guess it only affects future treebanks, not those built against older versions of the ERG). I like consistency.

goodmami commented 5 years ago

Closing this issue as it has been resolved in trunk and discussion seems to be over.

oepen commented 5 years ago

hi dan,

belatedly, i would advise against renaming ‘person’ to ‘pers’ in MRSs (i.e. i would in fact recommend you revert this change to the SEM-I in the trunk).

originally, i was a bit skeptical of this move because it seemed gratuitous and effectively eliminates a welcome test case for proper SEM-I software support. tonight, i was looking at the WeSearch index creation and query infrastructure and realized that renamings in the SEM-I need to be reflected there too (WeSearch internally uses RDF, i.e. the SEM-I hierarchy is partially replicated in the infrastructure).

i suspect this observation is likely representative of other consumers of MRSs and, thus, have a heightened sense now that we should be very conservative about interface changes—especially when there is no good reason requiring a one-to-one renaming.

greetings from far away :-), oe
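One way a downstream consumer might notice such an interface change is to compare the type names of two SEM-I versions before updating. The sketch below is hypothetical: `removed_names` and the two name sets are illustrative, model only the property values discussed in this issue, and have nothing to do with how WeSearch works.

```python
def removed_names(old_names, new_names):
    """Return names present in the old inventory but missing from the new one."""
    return sorted(set(old_names) - set(new_names))


# Property value types before and after the proposed rename (illustrative only).
old_property_types = {"person", "1", "2", "3"}
new_property_types = {"pers", "1", "2", "3"}

print(removed_names(old_property_types, new_property_types))  # ['person']
```

Any name in the result is one that a consumer replicating the hierarchy would have to update on its side.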
