tibetan-nlp / annotation-docs

Tibetan annotation docs

Tagging of End of Quotation (zhes bya ba)? #5

Closed solmsdorf closed 7 years ago

solmsdorf commented 7 years ago

How should we annotate (zhes) bya ba/zer ba "thus said/so-called" (end of quotation)? Like a copula?

E.g. from Mila 11a: zhing skal (arg1?) bre pe stan chung bya ba (cop?)

heacu commented 7 years ago

There are several issues here that need to be untangled.

  1. What to do about the (optional) quotative verb zhes, as in zhing skal bre pe stan chung zhes bya ba? We could consider it a content word, making it the argcl of bya ba, with bre pe stan chung an argument of zhes. Or we could consider zhes a function word, like a complementizer that introduces an embedded clause. bre pe stan chung would then be an argument of bya ba, and zhes would depend on bre pe stan chung, perhaps via mark (although mark is intended for words that introduce finite subordinate clauses).

  2. What argument role does zhing skal get? We agreed in earlier discussion that it should be an argument of bya ba, but should it be arg1, as suggested above, or arg2? Since an agent (the speech source) is in principle possible, if not in this exact construction, I would label it arg2.

  3. What argument role does the head of the reported speech phrase get? In the version of the sentence without zhes, I would label bre pe stan chung an arg3.

Treating zhes as a function word has one advantage: sentences with zhes are treated the same as sentences without it with regard to the key dependency relations. But if this does a disservice to the syntax (for example, if there is evidence that constructions with and without zhes behave differently), then a uniform analysis might not be advisable.

I suggest that we proceed with a tentative approach, and revisit the issue later when we have a catalogue of cases to consider. It is worth following much of this up on the UD list, but I think we should wait until we have more and varied examples and have also translated them so that our questions will be more easily understood.
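The uniformity argument above can be made concrete with a rough sketch. It encodes the candidate analyses as (dependent, relation, head) triples, using the labels proposed in this comment. The triple format, the helper names, and the placeholder label `arg?` (the role of bre pe stan chung under zhes is not fixed in the discussion) are assumptions made for illustration only, not part of the annotation scheme.

```python
# Each edge is a (dependent, relation, head) triple over phrase heads.
WITHOUT_ZHES = {
    ("zhing skal", "arg2", "bya ba"),
    ("bre pe stan chung", "arg3", "bya ba"),
}

# zhes as a function word: the same core edges plus one mark edge.
ZHES_AS_FUNCTION_WORD = WITHOUT_ZHES | {
    ("zhes", "mark", "bre pe stan chung"),
}

# zhes as a content word: it heads the reported speech itself.
ZHES_AS_CONTENT_WORD = {
    ("zhing skal", "arg2", "bya ba"),
    ("zhes", "argcl", "bya ba"),
    ("bre pe stan chung", "arg?", "zhes"),  # hypothetical label, role undecided
}


def core_relations(edges, ignore=("zhes",)):
    """Drop the edges whose dependent is the quotative itself."""
    return {edge for edge in edges if edge[0] not in ignore}


def is_uniform(with_zhes, without_zhes):
    """True if adding zhes leaves all other dependency edges unchanged."""
    return core_relations(with_zhes) == without_zhes


print(is_uniform(ZHES_AS_FUNCTION_WORD, WITHOUT_ZHES))
print(is_uniform(ZHES_AS_CONTENT_WORD, WITHOUT_ZHES))
```

Under these assumptions only the function-word analysis leaves the remaining edges untouched, which is the advantage described above. The sketch says nothing about which analysis is syntactically correct.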

solmsdorf commented 7 years ago

We are rather uncertain about the classification of zhes as a verb. Could someone please elaborate on this (beyond the short interpretation in the SOAS system: http://larkpie.net/tibetancorpus/node/113706)? Etymologically this may be the case; however, its usage suggests something else: its allomorphs ces and zhes depend on the final of the preceding syllable, which also points to a classification as a function word.

That is to say, we definitely lean towards a qualification of zhes as a function word. Given the high degree of its optionality we would rather ignore it in annotating verb dependencies.

However, more examples are needed and will be uploaded here.
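To illustrate the allomorphy point, here is a minimal sketch of how the form of the quotative could be chosen from the final of the preceding syllable in Wylie transliteration. The thread only states that ces and zhes alternate in this way; the concrete table below (ces after g, d, b; zhes after other finals and open syllables; a third form shes after s) follows the usual description in grammars of Classical Tibetan and is an assumption here, as is the function name. It should be checked against the corpus before being relied on.

```python
def quotative_allomorph(prev_syllable: str) -> str:
    """Pick the quotative form that follows a Wylie-transliterated syllable.

    Assumed rule (not spelled out in this thread):
      ces  after final g, d, b
      shes after final s
      zhes after ng, n, m, r, l, ' and after open syllables
    """
    syllable = prev_syllable.strip()
    if syllable.endswith("ng"):  # final nga must be tested before plain "g"
        return "zhes"
    if syllable.endswith(("g", "d", "b")):
        return "ces"
    if syllable.endswith("s"):  # assumption: third allomorph, not discussed here
        return "shes"
    return "zhes"


for syllable in ("zhu", "chung", "yod"):
    print(syllable, quotative_allomorph(syllable))
```

That the form is predictable from the preceding syllable alone is the behaviour that makes zhes look like a particle rather than an independent verb.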

heacu commented 7 years ago

Good point about the allomorphs.

So as we collect more attestations for inclusion here, let's proceed in the meanwhile on the assumption that zhes / ces will depend as mark on the head of the speech complement.

We need not indicate the mark relationship yet, since we aren't linking to other mark. Technically, this violates our principle that "every verb should be annotated and no verb should be ignored", but since the verbhood of zhes / ces is being called into question, this seems like a legitimate exception.

heacu commented 7 years ago

Examples of zhes copied in from Slack for the documentation via Samyo:

  1. gral pa arg1 rnams snyan gsan argcl par zhu argcl zhes brjod nas “she said: ‘elders, please listen closely’” (11a)
  2. a mas arg1 … pha shul arg2 bus arg1 ’dzin argcl du ’jug argcl par zhu argcl byas pas “my mother said: ‘I ask you to let my son take possession of his patrimony’” (11a)
  3. a ma ni arg1 … da gzigs pa’i dus obl la bab argcl gda’o aux zer “my mother said: ‘the time has come that you look [upon us]’” (11b)

Again, we think that zhes is not a verb but a suffix to mark the end of quotation.

solmsdorf commented 7 years ago

Room for further examples:

  1. gung thang gi lhun grub ’gron khang bya bar "in a [place] called Lhun grub Guest House in Gung thang" (13b)

heacu commented 7 years ago

@nikolaisolmsdorf If you are okay with https://tibetan-nlp.github.io/lim-annodoc/#quotatives, then I'll close this issue. If you want to debate the POS tag for zhes / ces, then please take that up on Slack in the #tagging channel where Nathan will see it.

solmsdorf commented 7 years ago

@torma Many thanks, the entry in the lim-annodoc is fine. However, how do we deal with the "so-called/named" usages of zhes? Cf. the above-mentioned example from Mila 11a:

zhing skal bre pe stan chung bya ba

For the time being, we annotated it like this: zhing skal arg1 bre pe stan chung arg2 bya ba

Still, we are not sure if this is a reasonable approach.

heacu commented 7 years ago

Hi -

It is worth noting that this example shouldn't even be on page 11a. If we had page 10b, it would appear on 10b, since our protocol is to reach greedily across the page until the next sentence boundary. Below is the full sentence:

ཨ་མ་ལ་ཕ་མས་བྱིན་པའི་ཞིང་སྐལ་བྲེ་པེ་སྟན་ཆུང་བྱ་བ་མིང་མི་སྙན་རུང་སྟོན་ཐོག་ཡོང་པ་ཅིག་ཡོད་པ་དེ།

Which Andy Quintman translates as follows:

At that time there was a field given to my mother by her parents as her inheritance, known by the unpleasant name Trepé Tenchung (Little Boot Sole) but producing an excellent harvest.

If it didn't require modifying all of the word offsets, I would be tempted to remove the sentence from the page. Instead, for the time being we can leave the partial sentence, which you have analysed like so:

[Annotation diagram headed by bya_ba; image not preserved]

The problem I have with this analysis is that it treats the clauses as entirely independent from each other, whereas the relations between them need to be indicated. For example, snyan should be linked as advcl to yong pa, since a rung clause adverbially modifies another clause. Even after this relation is inserted, there is no link between zhing skal ... bya ba and the remainder of the sentence, which seems odd. Is it possible that the correct analysis is:

(ཨ་མ་ལ་ཕ་མས་བྱིན་པའི་ཞིང་སྐལ་)1 (བྲེ་པེ་སྟན་ཆུང་བྱ་བ་)2 (མིང་མི་སྙན་རུང་སྟོན་ཐོག་ཡོང་པ་)3 ཅིག་ཡོད་པ་དེ།

In which zhing skal is actually arg1 to yod pa, and clauses 2 and 3 are modifiers (say, post-nominal headless relative clauses) of 1. If so, then bya ba would have only one argument, bre pe stan chung, which I would mark as arg2.

Ed
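The difference between the two analyses in the comment above can be sketched as follows, again with (dependent, relation, head) triples. A rough check counts the nodes that are never a dependent: a single connected dependency tree has exactly one such root. The label `acl` for the two post-nominal modifier clauses is an assumption (the comment only suggests "post-nominal headless relative clauses" without naming a relation), and the helper name is hypothetical.

```python
def roots(edges):
    """Return the nodes that never occur as a dependent, sorted by name."""
    dependents = {dep for dep, _, _ in edges}
    heads = {head for _, _, head in edges}
    return sorted((heads | dependents) - dependents)


# Earlier analysis, with the advcl link for the rung clause already added.
EARLIER = [
    ("zhing skal", "arg1", "bya ba"),
    ("bre pe stan chung", "arg2", "bya ba"),
    ("snyan", "advcl", "yong pa"),
]

# Proposed reanalysis: zhing skal is arg1 of yod pa, and clauses 2 and 3
# modify it.
PROPOSED = [
    ("zhing skal", "arg1", "yod pa"),
    ("bre pe stan chung", "arg2", "bya ba"),
    ("bya ba", "acl", "zhing skal"),   # assumed label for the modifier clause
    ("yong pa", "acl", "zhing skal"),  # assumed label for the modifier clause
    ("snyan", "advcl", "yong pa"),
]

print(roots(EARLIER))
print(roots(PROPOSED))
```

With these edges the earlier analysis is left with two unattached heads, bya ba and yong pa, while the reanalysis hangs everything under yod pa. Counting roots is only a quick sanity check and would not detect cycles.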

nh36 commented 7 years ago

Hey folks. The reason for seeing zhes as a verb is purely morphological. zhes is a past, zhe is a present/future. All the things that can come after verbs, like -na and -pa-dang, can come after zhes. It is true that we have some allomorphy here and that this is unique to this verb, but so what? It is still a verb. Also, as discussed in this chain, other verbs like byed and bgyid can function in a similar way, and in their cases no one would suggest that they are not verbs. So, if we already know that verba dicendi form a certain class and that byed and bgyid are members of this class, what is the harm in seeing zhes as another example of this type? If you want examples, they can be found in the training corpus by looking for zhe, ces-pa, etc.

heacu commented 7 years ago

Ok, based on the discussion above and a conversation with @nikolaisolmsdorf today, we will close this issue having resolved for now the following: