Open nschneid opened 6 years ago
OK how about these guidelines: https://universaldependencies.org/en/pos/ADV.html
Implemented in EWT! (modulo some existing PronType=Int annotations that should be PronType=Rel)
So we should update "none" in PUD to be PRON with PronType=Neg? (among other changes)
anything to be done for "however"? that was left out of the EWT updates
"anyway"?
any_ADV, there_PRON left blank?
> anything to be done for "however"? that was left out of the EWT updates
> "anyway"?
These are both mainly discourse connectives, so I'm not sure they need a PronType.
> any_ADV, there_PRON left blank?
there_PRON: for expletive "there" I'm not sure if any of the PronType values would be a good fit. This is documented at https://universaldependencies.org/en/pos/PRON.html#expletive-there
any_ADV: "any" is normally DET. I see "any/ADV longer/ADV" and similar; not sure this is actually correct. Also "it doesn't hurt any/ADV" (= at all). Could these be DET attaching as advmod? Feels related to "some/DET 540,000 men". Curious to hear @amir-zeldes's take when he's back from vacation.
> however
> mainly discourse connectives
Agreed that the discourse versions are fine w/o. They are not always discourse, though, especially "however":
# sent_id = email-enronsent24_01-0036
# text = My goal, however optimistic, is to execute the risk policy by the end of today.
4 however however ADV RB _ 5 advmod 5:advmod _
5 optimistic optimistic ADJ JJ Degree=Pos 2 amod 2:amod SpaceAfter=No
# sent_id = email-enronsent24_01-0093
# text = My goal, however optimistic, is to execute the risk policy by the end of today.
# sent_id = reviews-332105-0004
# text = I will reccommend his services however/whenever possible!
6 however however ADV WRB PronType=Int 3 advmod 3:advmod|9:advmod SpaceAfter=No
7 / / SYM SYM _ 8 cc 8:cc SpaceAfter=No
8 whenever whenever ADV WRB PronType=Rel 6 conj 3:advmod|6:conj|9:advmod _
(those are the only ones I saw for "however")
Technically you're right: the "however optimistic" ones should be PronType=Int. I suppose these are just uses of "however" that modify a non-predicate ADJ or ADV.
"however/whenever possible": as "however" is the first item in the coordination, I suppose it should be the head of the free relative.
> Technically you're right
(insert satisfied seal meme here)
Aha, apparently "however" receives a different xpos: RB for the discourse connective use and WRB for the interrogative or relative use! (This is documented in the PTB tagging guidelines.) So we can require PronType conditional on that.
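A rough sketch of what such a conditional check could look like, in case it helps. It assumes standard tab-separated CoNLL-U with ten columns; the function name check_however and the message strings are made up for illustration and are not part of any existing UD validator.

```python
def check_however(conllu_text):
    """Flag 'however' tokens whose PronType does not match their XPOS.

    Hypothetical rule, following the comment above:
      WRB (interrogative/relative use) -> PronType expected
      RB  (discourse connective use)   -> no PronType expected
    """
    problems = []
    sent_id = None
    for line in conllu_text.splitlines():
        line = line.rstrip("\n")
        if line.startswith("# sent_id"):
            sent_id = line.split("=", 1)[1].strip()
            continue
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line; skip it
        tok_id, _form, lemma, _upos, xpos, feats = cols[:6]
        if lemma.lower() != "however":
            continue
        has_prontype = any(f.startswith("PronType=") for f in feats.split("|"))
        if xpos == "WRB" and not has_prontype:
            problems.append((sent_id, tok_id, "WRB without PronType"))
        elif xpos == "RB" and has_prontype:
            problems.append((sent_id, tok_id, "RB with PronType"))
    return problems


# One of the tokens quoted earlier in this thread, rebuilt with tabs.
sample = "\n".join([
    "# sent_id = reviews-332105-0004",
    "\t".join(["6", "however", "however", "ADV", "WRB", "PronType=Int",
               "3", "advmod", "3:advmod|9:advmod", "SpaceAfter=No"]),
])
print(check_however(sample))
```

Note that this only checks consistency between xpos and PronType. It would not catch the "however optimistic" cases if their xpos is itself wrong (RB where WRB is intended); those would need to be found by hand or with a separate query.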
It is not obvious how pronouns should be lemmatized (cf. #276 for Slavic). The UD_English corpus does the following:
Nominative (PRP):
Accusative (PRP):
Dependent possessive (PRP$):
The pattern here is that they are normalized to nominative case, except for "my" and "its", which should probably be "I" and "it", respectively.
Independent possessive (PRP, no morphological features): mine, yours, ours, theirs, etc.: no normalization
Reflexive (PRP): myself, yourself, ourselves, yourselves, themselves, etc.: no normalization
WH animate: who, whom, whoever, whomever: no normalization
I am not sure why whom, whomever, the independent possessives, and the reflexives aren't normalized to nominative as well.
There is one token where ’s in Let’s has been lemmatized as us (it should presumably be we for consistency).
That said, the simplest policy may be to use the lemma field only for spelling normalization (#513) and not perform case normalization at all. If the end user wants to map pronouns to nominative case, that is not hard to implement as postprocessing once spelling is consistent.
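To illustrate how small that postprocessing step could be, here is a minimal sketch. The mapping table and the function name to_nominative are my own illustration, not something the corpus or any UD tool provides, and the table makes choices (e.g. reflexives and independent possessives mapped to the plain nominative) that a user might want to change.

```python
# Hypothetical lookup from a surface-normalized pronoun lemma to its
# nominative form. Illustrative only; not taken from the corpus.
NOMINATIVE = {
    "me": "I", "my": "I", "mine": "I", "myself": "I",
    "us": "we", "our": "we", "ours": "we", "ourselves": "we",
    "your": "you", "yours": "you", "yourself": "you", "yourselves": "you",
    "him": "he", "his": "he", "himself": "he",
    "her": "she", "hers": "she", "herself": "she",
    "its": "it", "itself": "it",
    "them": "they", "their": "they", "theirs": "they", "themselves": "they",
    "whom": "who", "whomever": "whoever",
}


def to_nominative(lemma, upos="PRON"):
    """Map a pronoun lemma to nominative case; leave everything else alone."""
    if upos != "PRON":
        return lemma
    return NOMINATIVE.get(lemma.lower(), lemma)


for lemma in ["whom", "us", "its", "I", "theirs"]:
    print(lemma, "->", to_nominative(lemma))
```

Restricting the mapping to tokens tagged PRON keeps it from touching unrelated words, and lemmas that are already nominative pass through unchanged.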
Thoughts?