IAHLT / UD_Hebrew

Hebrew Universal Dependencies Treebank

List of AUX #53

Open shirawigi opened 2 years ago

shirawigi commented 2 years ago

Hi @amir-zeldes,

During validation, we noticed several verbs that were annotated as AUX even though they are not on our AUX list. Two of them, which we think might belong on the list, are רשאי and מועד. For example:

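  1. במקרים מסוימים, רשאי הרכב השופטים הדן בתיק להחליט על הרחבת ההרכב. (In certain cases, the panel of judges hearing the case may decide to expand the panel.)
  2. קבוצות אתניות כמו שחורים והיספנים מועדים ללקות בה בין פי שניים לפי שלושה מלבנים. (Ethnic groups such as Blacks and Hispanics are two to three times more prone to suffer from it than whites.)

Both of these verbs have other meanings of course, but maybe when they precede an infinitive they should be tagged as AUX? Since זכאי is on the AUX list and is pretty similar to רשאי, we thought רשאי might also be an AUX.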

What do you think? Thanks!
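The validation check described above (AUX tokens whose lemma is not on the AUX list) can be sketched roughly as follows. This is only an illustration, not the project's actual validator: the function name `unlisted_aux` is made up, and `AUX_LEMMAS` is a hypothetical subset containing only lemmas that this thread mentions as already being treated as auxiliaries. It assumes standard 10-column, tab-separated CoNLL-U input.

```python
from collections import Counter

# Hypothetical subset of the AUX list: only lemmas this thread
# mentions as already being auxiliaries. The real list is longer.
AUX_LEMMAS = {"היה", "זכאי", "עלול", "צריך"}


def unlisted_aux(conllu_text, allowed=AUX_LEMMAS):
    """Count lemmas tagged UPOS=AUX that are not in `allowed`."""
    counts = Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword-token ranges (e.g. 1-2)
        # and empty nodes (e.g. 1.1): none have a plain integer ID
        # together with exactly 10 columns.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        lemma, upos = cols[2], cols[3]
        if upos == "AUX" and lemma not in allowed:
            counts[lemma] += 1
    return counts
```

Calling `unlisted_aux(open(path, encoding="utf-8").read())` on a treebank file would return a `Counter` of candidate lemmas such as רשאי or מועד, which can then be reviewed one by one.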

amir-zeldes commented 2 years ago

Hm, on the one hand I don't like inflating the list of aux; on the other hand, רשאי is quite similar to English "may", which is included by virtue of being one of the English morphosyntactic modal verbs. I guess it boils down to what we want to consider the "lexical" root of the clause: is sentence 1 about "deciding" or about what one may or may not do? I agree that if we include זכאי we should also include רשאי, they are basically the same. And we have עלול, so I suppose מועד is not a stretch either... But I would be happy to hear more opinions, as this is rather subjective.

Hilla-Merhav commented 2 years ago

@amir-zeldes I agree, if זכאי and עלול are auxiliaries then I think we can also add רשאי and מועד to the list :) What do you think about nahag + infinitive? For example:

הצליינים נוהגים להשאיר בסדק הזה פתקים (The pilgrims usually leave notes in this crack.)

One of the functions used in the documentation of auxiliaries is "Periphrastic aspect: progressive", and I think this makes נהג (only in the structure nahag + infinitive) a candidate for the AUX list. We have another periphrastic progressive auxiliary, היה, but it is used in the past tense only:

הצליינים היו משאירים בסדק הזה פתקים (The pilgrims used to leave notes in this crack.)

amir-zeldes commented 2 years ago

German has exactly this kind of verb too, "pflegen" (with a to-infinitive complement), but it is analyzed as the head + xcomp, see result 5 here:

http://match.grew.fr/?corpus=UD_German-HDT@2.9&custom=61dc75dbb8c9d

I fear we are inflating the AUX list a bit liberally, so I would probably treat nohagim as the head.

yaelFinkelshtein commented 2 years ago

@amir-zeldes I just encountered "יצטרך לעמוד בנטל ההוכחה", and as "צריך" is an auxiliary, I was pretty sure that "הצטרך" as a lemma is also supposed to be on the list, but it currently isn't there. If we add it, does that mean we'll add the "HebBinyan" feature to it? Right now none of the auxiliaries get it (should we add "PAAL" for "היה"?)

amir-zeldes commented 2 years ago

I think if we annotate צריך as AUX then it should apply to יצטרך as well, and any proper template verb, incl. היה, should get HebBinyan, even if it's AUX.
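To find the tokens affected by this decision, one could list AUX tokens that currently lack a HebBinyan feature. The following is a minimal sketch under the assumption of standard 10-column CoNLL-U input; the helper names `parse_feats` and `aux_missing_binyan` are invented for illustration. Not every AUX lemma is necessarily a template verb, so the hits are candidates for manual review rather than errors.

```python
def parse_feats(feats):
    """Turn a CoNLL-U FEATS string like 'A=x|B=y' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def aux_missing_binyan(conllu_text):
    """Return (id, form, lemma) for AUX tokens without HebBinyan."""
    missing = []
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular token lines (integer ID, 10 columns).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == "AUX" and "HebBinyan" not in parse_feats(cols[5]):
            missing.append((cols[0], cols[1], cols[2]))
    return missing
```

The returned token IDs are sentence-internal, so for a whole file it would make sense to also track the preceding `# sent_id` comment when reporting results.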