UniversalDependencies / UD_German-GSD

PRON vs DET for Possessivpronomen #22

Closed kanayamah closed 4 years ago

kanayamah commented 4 years ago

Many possessive pronouns (mein, dein, ...) have UPOS PRON even though they function as determiners, as in (1) below, but some are tagged DET, as in (2). It is reasonable to use PRON for the true pronoun cases such as (3), though they are rare, and only in those cases do they have the lemma ich instead of mein. Why don't you rely on the original PoS tag (XPOS) PPOSAT/PPOSS to consistently tag them DET and PRON?

(1)

# sent_id = train-s11
# text = nach fast einer weiteren Stunde, nachdem sich mein Verlobter mehrmals erkundigt und zuletzt sogar beschwert hatte kam endlich der Konditor.
9   mein    mein    PRON    PPOSAT  Case=Nom|Gender=Masc|Number=Sing|Poss=Yes   10  det:poss    _   _
10  Verlobter   Verlobte    NOUN    NN  Case=Nom|Gender=Masc|Number=Sing    12  nsubj   _   _

(2)

# sent_id = train-s666
# text = Habe meinen Sinus pilonidalis hier behandeln lassen und kann es nur jedem weiterempfehlen.
2   meinen  mein    DET PPOSAT  Case=Acc|Gender=Masc|Number=Sing|Person=1|Poss=Yes|PronType=Prs 3   det _   _
3   Sinus   Sinus   PROPN   NN  Case=Acc|Gender=Masc|Number=Sing    7   obj _   _
4   pilonidalis pilonidalis PROPN   NE  Case=Acc|Gender=Masc|Number=Sing    3   flat    _   _

(3)


# sent_id = train-s1611
# text = Manche Männer können da ein Hemmschuh sein, meiner hat mir sehr geholfen.
9   meiner  ich PRON    PPOSS   Case=Nom|Gender=Masc|Number=Sing|Poss=Yes   13  nsubj   _   _
10  hat haben   AUX VAFIN   Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   

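The inconsistency described above can be surveyed by tallying which UPOS tags co-occur with the possessive XPOS tags. The following is only a minimal sketch, not part of the treebank tooling: it assumes tab-separated 10-column CoNLL-U input, and the function name `tally_possessive_upos` and the in-memory `sample` are hypothetical, built from the tokens in examples (1) to (3).

```python
from collections import Counter

# XPOS tags that mark possessives (attributive and substitutive).
POSSESSIVE_XPOS = {"PPOSAT", "PPOSS"}


def tally_possessive_upos(conllu_lines):
    """Count (XPOS, UPOS) pairs for tokens whose XPOS marks a possessive."""
    counts = Counter()
    for line in conllu_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip malformed lines, multiword ranges, empty nodes
        upos, xpos = cols[3], cols[4]
        if xpos in POSSESSIVE_XPOS:
            counts[(xpos, upos)] += 1
    return counts


# Hypothetical sample: the three possessive tokens quoted in this issue.
sample = [
    "# sent_id = train-s11",
    "9\tmein\tmein\tPRON\tPPOSAT\tCase=Nom|Gender=Masc|Number=Sing|Poss=Yes\t10\tdet:poss\t_\t_",
    "# sent_id = train-s666",
    "2\tmeinen\tmein\tDET\tPPOSAT\tCase=Acc|Gender=Masc|Number=Sing|Person=1|Poss=Yes|PronType=Prs\t3\tdet\t_\t_",
    "# sent_id = train-s1611",
    "9\tmeiner\tich\tPRON\tPPOSS\tCase=Nom|Gender=Masc|Number=Sing|Poss=Yes\t13\tnsubj\t_\t_",
]

counts = tally_possessive_upos(sample)
for (xpos, upos), n in sorted(counts.items()):
    print(xpos, upos, n)
```

On a real file, the same function could be fed an open file handle. A PPOSAT row split between PRON and DET would point to the inconsistency in (1) and (2).
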
dan-zeman commented 4 years ago

> Why don't you rely on the original PoS tag (XPOS) PPOSAT/PPOSS to consistently tag them DET and PRON?

In this corpus, XPOS is not "original". It was added later, together with the morphological features and lemmas, and it was predicted automatically.

But otherwise I agree that German possessives should be tagged DET.

kanayamah commented 4 years ago

@dan-zeman thank you for the explanation. Waiting for your fix!

dan-zeman commented 4 years ago

Fixed in the dev branch. It was done by a script, so certain cases may still be unresolved. For example, ihr is ambiguous between the non-possessive 2nd person plural pronoun, the 3rd person singular feminine possessive determiner, the 3rd person plural possessive determiner and (if upper/lowercase cannot be trusted) the 2nd person honorific possessive determiner.
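
For illustration only, a rule of this kind might look like the sketch below. It is not the script that was actually used for the fix. The lemma list, the function name `retag_possessive` and the choice to rely on the dependency relation are all assumptions made for the example. Tokens with the lemma ihr are always flagged for manual review, because the form alone cannot decide between the readings listed above.

```python
# Assumed list of possessive lemmas; not taken from the treebank tooling.
POSSESSIVE_LEMMAS = {"mein", "dein", "sein", "ihr", "unser", "euer"}


def retag_possessive(cols):
    """Retag one 10-column CoNLL-U token; return (new_cols, needs_review)."""
    cols = list(cols)  # work on a copy so the caller's data is untouched
    lemma, deprel = cols[2].lower(), cols[7]
    if lemma not in POSSESSIVE_LEMMAS:
        return cols, False
    # Attributive use: the token is attached as det or a det subtype.
    if deprel == "det" or deprel.startswith("det:"):
        cols[3] = "DET"
    # "ihr" stays ambiguous (2nd plural pronoun vs. several possessives),
    # so it is flagged whether or not the UPOS was changed.
    return cols, lemma == "ihr"


# Hypothetical usage with the token from example (1).
token = ["9", "mein", "mein", "PRON", "PPOSAT",
         "Case=Nom|Gender=Masc|Number=Sing|Poss=Yes", "10", "det:poss", "_", "_"]
new_token, needs_review = retag_possessive(token)
print(new_token[3], needs_review)
```

The substantive use in (3), with lemma ich and relation nsubj, is left as PRON by this rule because it only acts on the listed lemmas in a det relation.
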