UniversalDependencies / UD_Chinese-GSDSimp

Conversion of UD_Chinese-GSD to simplified Chinese characters.

Aspect markers should be `aux:aspect` instead of `case:aspect` #5

Open qipeng opened 4 years ago

qipeng commented 4 years ago

Aspect markers like 了 (le) and 过 (guo4) should be annotated as auxiliary verbs rather than case markers.
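A minimal sketch of how such a relabeling could be scripted over a CoNLL-U file, assuming only the DEPREL column needs to change. The helper name `relabel_deprel` and the three-token sample are hypothetical, and this is not necessarily how the treebank itself was fixed.

```python
def relabel_deprel(conllu_text, old="case:aspect", new="aux:aspect"):
    """Rewrite DEPREL `old` to `new` on token rows; return (text, n_changed)."""
    out = []
    changed = 0
    for line in conllu_text.split("\n"):
        cols = line.split("\t")
        # Token rows have exactly 10 tab-separated fields; comment lines
        # (starting with "#") and blank lines are passed through untouched.
        if not line.startswith("#") and len(cols) == 10 and cols[7] == old:
            cols[7] = new  # column 8 (index 7) is DEPREL
            line = "\t".join(cols)
            changed += 1
        out.append(line)
    return "\n".join(out), changed


# Invented three-token sentence for illustration, not copied from the treebank.
sample = "\n".join([
    "# text = 回了家",
    "1\t回\t回\tVERB\t_\t_\t0\troot\t_\t_",
    "2\t了\t了\tPART\t_\t_\t1\tcase:aspect\t_\t_",
    "3\t家\t家\tNOUN\t_\t_\t1\tobj\t_\t_",
    "",
])

fixed, n = relabel_deprel(sample)
print(n)
print(fixed)
```

The sketch leaves UPOS and the enhanced DEPS column alone, so those would need a separate pass if they also carry the old label.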

qipeng commented 4 years ago

Fixed in 19f801cc9da3dd3b791062f939c40b302e6e96dc.

KoichiYasuoka commented 4 years ago

Umm... How about discourse:sp as used in UD_Chinese-CFL?

qipeng commented 4 years ago

@KoichiYasuoka good question! I think actually according to https://universaldependencies.org/u/dep/discourse.html, many of the ones fixed don't qualify as discourse markers. An example in Chinese-CFL is https://github.com/UniversalDependencies/UD_Chinese-CFL/blob/master/zh_cfl-ud-test.conllu#L68

The sentence-final 了s are a bit more debatable -- many of them do have an aspect role, because constructions like 回 家 了 (hui2 jia1 le, went home) and 回 了 家 (hui2 le jia1, went home) often have exactly the same meaning. That is unlike 回 家 吧 (hui2 jia1 ba, (let's) go home), where 吧 (ba) is clearly a discourse marker and the alternative construction simply wouldn't work.

But it does seem like the UPOS tags on these are also ill-defined -- should they be AUX instead?

ermanh commented 4 years ago

FYI, when I worked on the projects Chinese-CFL and Chinese-HK, we developed and used the following guideline (I don't think it has ever been linked anywhere on the UD website pages, but the pages are there; we took a broad, liberal interpretation of "discourse" to apply to sentence final particles since many of them have discursive and/or pragmatic usages):

https://universaldependencies.org/zh/dep/discourse-sp.html

The last paragraph addresses how we handled the issue of ambiguity for 了 in sentence-final position:

> To differentiate between the perfective aspect marker 了 / le and the sentence-final particle 了 / le, one should define as sentence-final a 了 / le which is placed at the end of a clause or a sentence (though it may be followed by additional sentence-final particles), unless a clear context makes it possible to determine that it is the aspect marker. Before an object, adverbial of duration or frequency, and other non-sentence-final elements, 了 / le will always be annotated as an aspect marker with aux.

So we chose to default to discourse:sp when 了 is at the end of the sentence (yes, we acknowledged that it was a largely arbitrary decision). Otherwise it is just aux, at least for our two projects (we decided against subcategorizing into aux:aspect).
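The positional default described here can be sketched roughly as follows. This assumes a pre-tokenized sentence with UPOS tags, approximates "end of a clause or a sentence" as "followed by punctuation or nothing", and uses a small illustrative particle list; the function name `default_le_label` and the set `FOLLOWING_SFPS` are hypothetical. The "unless a clear context" exception cannot be captured by a rule like this and would still need manual review.

```python
# Illustrative, non-exhaustive set of particles that may stack after 了.
FOLLOWING_SFPS = {"吧", "吗", "嗎", "呢", "啊"}


def default_le_label(forms, upos, i):
    """Return the default relation for the 了 at index i.

    "discourse:sp" if 了 closes the clause (optionally followed by more
    sentence-final particles), otherwise "aux".
    """
    if forms[i] != "了":
        raise ValueError("token at index i is not 了")
    j = i + 1
    # Skip any additional sentence-final particles stacked after 了.
    while j < len(forms) and forms[j] in FOLLOWING_SFPS:
        j += 1
    # Assumption: a clause boundary is the end of input or a PUNCT token.
    at_clause_end = j == len(forms) or upos[j] == "PUNCT"
    return "discourse:sp" if at_clause_end else "aux"


print(default_le_label(["回", "家", "了"], ["VERB", "NOUN", "PART"], 2))
print(default_le_label(["回", "了", "家"], ["VERB", "AUX", "NOUN"], 1))
print(default_le_label(["回", "家", "了", "吧", "。"],
                       ["VERB", "NOUN", "PART", "PART", "PUNCT"], 2))
```

Under these assumptions the first and third calls fall into the sentence-final default, while the second (了 before an object) comes out as `aux`.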

qipeng commented 4 years ago

Thanks for the input @ermanh, this is a very helpful perspective! It seems we're in agreement about 了s that appear in the middle of a sentence (and pretty clear discourse markers like , , although it's definitely helpful to see examples like 罢了 and 而已).

Frankly, I still struggle to count many sentence-final 了s as merely discourse markers, because of the usually grammatical (and often semantically identical) rearrangements like VERB OBJ 了 vs VERB 了 OBJ.

ermanh commented 4 years ago

I think in cases such as the following it is more obviously a sentence-final particle than an aspect marker:

(1) 他可以回家了 (2) 我很想回家了 (3) 她又在咳嗽了

In these sentences, 了 is not modifying the (nearest) verb: for (1) and (2), 回家 hasn't started yet, and for (3), 咳嗽 is progressive because of 在. It's also not modifying the auxiliary 可以 in (1) or the verb 想 in (2).

Rather than indicating the aspect of the verb, 了 here modifies the entire sentence, or the proposition expressed by the sentence, indicating that there is a change of state or a new situation. One could translate the above sentences as something like "It has now become the case that... (1) he can go home / (2) I really want to go home / (3) she's coughing again."

qipeng commented 4 years ago

@ermanh Thank you for sharing these cases! Seems like transposition between 了 and the direct object at the end is a good test! Intransitive verbs are probably a good feature for the decision boundary as well (coughing, in your example). I might add that ADJ 了 is more often a discourse marker than not, even though the boundary is murky between ADJs and VERBs in many cases.

ermanh commented 4 years ago

@qipeng Agreed on the adjectives (specifically stative verbs): they should not be able to take the perfective aspect 了, since stative verbs are inherently atelic, so if they're followed by 了 at the end of the sentence, that 了 should be a sentence-final particle.

But I'm not sure intransitivity is a defining diagnostic for this? E.g. [她睡了(一整天)], [我哭了之後][就突然笑了一下]. The intransitive verbs 睡 "sleep", 哭 "cry", 笑 "laugh" can be followed by the aspect marker 了 in non-final position, so if the above sentences ended right after 了, it could be ambiguous whether it is the aspect marker or the sentence-final particle. Sometimes context can help disambiguate, but without context it's a toss-up.

qipeng commented 4 years ago

Great point about intransitive verbs! Need to think a bit more about how to distinguish 了 as a discourse vs. aspect marker in sentence-final position after intransitive verbs...

In the meantime I plan to start fixing UPOS for aspect markers in GSD and GSDSimp soon-ish.

KoichiYasuoka commented 4 years ago

Well, I'm still unsure whether we can really treat these uses of 了 as AUX. I think that typical auxiliary verbs (可, 被, 会, ...) are placed before verbs. How about 了?

qipeng commented 4 years ago

I think when 了 functions as an aspect marker, then by UD convention it should be classified as an AUX. This is a bit like "have" in "have done" in English.

Another example of AUX-like aspect marker that's placed after the verb I can think of in Chinese is "好/完", e.g., "吃 好 饭" (done having a meal) and "做 完 作业" (done with homework) where although both "好/完" mean something along the lines of "finished", they are really supplementing the meaning of the main verbs.

ermanh commented 4 years ago

In Chinese-HK and Chinese-CFL, we tagged aspectual use of 了 as AUX for the same reason @qipeng mentioned -- UD's AUX includes markers that convey aspect (besides tense, modality, etc.). The other two post-verbal aspect markers we included are 過 (experiential) and 著 (durative).

> Another example of AUX-like aspect marker that's placed after the verb I can think of in Chinese is "好/完"

We tagged these as ADJ or VERB accordingly, because they pattern syntactically like the second element of resultative compounds/result complements, e.g., (1) 我[哭濕]了衣服 ("I cried my clothes wet") and (2) 他[騎累]了馬 ("He rode the horse until he/it got tired"), where the second verb in the brackets describes the state that is brought about as a result of the first verb. As in these compounds, 好 and 完 can co-occur with aspectual 了: (3) 吃好了飯, (4) 做完了作業. One can also insert 得 or 不 between the two verbs, as in (5) 做[得/不][完/好], which is not possible with 了: (6) *做[得/不]了.

Agreed that 好 and 完 contain an aspectual meaning in these constructions, but that seems to have more to do with the inherent semantics of these words themselves in combination with the resultative compound construction itself (something that becomes good (好) as a result of some other action is likely successful and completed; 完 "finish" already contains an end point in time in its meaning). They do seem to be somewhat grammaticalized, but they still behave syntactically more like verbs than like aspect markers such as 了, 過, and 著.