Open NathanD38 opened 3 years ago
Hi @NathanD38 - these are important questions and there are several here, so let me clarify:
aux to anything, so in "hu tsafuy lalexet" it's AUX, but in "tsafuy laredet geshem" it's VERB (because it's not an auxiliary to anything, so it can't be AUX). This type of impersonal VERBal construction is basically in the same bucket as efshar and other things tagged VERB which lack the canonical VERB paradigm (impersonal verbs/verboids).
Now about the binyan:
As a result of the 'axioms' above, forms like "katuv" cannot be considered a separate binyan (or at least traditionally they are not). The reason for placing them with PAAL is that historically in Semitic, each binyan has associated participles, and PAUL is the passive participle corresponding to the PAAL form (so, katav <> katuv, and active kotev). In Hebrew, the tensed passive equivalent of PAAL (Arabic fu'ila) was lost, and we mostly get nif'al (like ratsax<>nirtsax; incidentally, this replacement has happened in colloquial Arabic as well). But still the participle PAUL is counted as binyan PAAL (the same holds for Arabic maf'uul, e.g. maktuub).
Finally regarding feats, xashuv is not a modal (it's an evaluative, sure, but so is "tov"), so it cannot have VerbType=Mod, and in fact if we think it's not a VERB in general, it should just be ADJ, and not receive Voice at all. But katuv should be VERB, Voice=Pass, HebBinyan=PAAL, VerbForm=Part whenever it is not a lexicalized adjective.
Hope that makes sense!
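The feature rules in this reply can be sketched as a small consistency check. This is only an illustration: `parse_feats` and `check_token` are hypothetical helpers, not part of any UD tool, and the third rule (a passive PAAL VERB is expected to be a participle) is an assumption drawn from the point that the tensed PAAL passive was lost.

```python
# Hypothetical helpers; they encode only the rules stated in the reply above.

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Masc|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def check_token(upos, feats):
    """Return a list of problems found for one token."""
    f = parse_feats(feats)
    problems = []
    if upos == "ADJ":
        # An ADJ such as xashuv should carry neither Voice nor VerbType=Mod.
        if "Voice" in f:
            problems.append("ADJ with Voice")
        if f.get("VerbType") == "Mod":
            problems.append("ADJ with VerbType=Mod")
    if upos == "VERB" and f.get("HebBinyan") == "PAAL" and f.get("Voice") == "Pass":
        # Assumption: a passive PAAL form such as katuv is always a participle.
        if f.get("VerbForm") != "Part":
            problems.append("passive PAAL VERB without VerbForm=Part")
    return problems


# katuv tagged as described above passes; an ADJ carrying Voice is flagged.
print(check_token("VERB", "HebBinyan=PAAL|VerbForm=Part|Voice=Pass"))
print(check_token("ADJ", "Gender=Masc|Number=Sing|Voice=Pass"))
```

A check like this only flags feature combinations; deciding whether a given form is a lexicalized adjective still needs the linguistic judgment discussed in the thread.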
So, if I understand correctly, beinoni paul (such as katuv) is never an ADJ? What about when it follows an ADP, like in the following sentence: במקביל, אושר בחוק לבנקים לקבל דמי הפצה מהקרנות והקופות, כנהוג במכירת ניירות ערך לציבור. If it is a verb, can it receive case from the ADP? Or we should assign it upos=ADJ in cases like this?
@amir-zeldes Thank you for the detailed answer!
Is there a list we can access or compile ourselves of lexicalized adjectives (not just from beinoni paul)? Do you consider other beinoni paul forms as lexicalized adjectives, apart from "xashuv"?
To understand the features given in each case:
the modal AUX (עשוי, עלול, צפוי, אמור) in the example below, "hu tsafuy/amur/alul/asuy lalexet", will get the following features: Person=3, Number=Sing, Gender=Masc, VerbType=Mod
the VERB in the impersonal construction below, "tsafuy/amur/alul/asuy laredet geshem", will get the following: Person=3, Number=Sing, Gender=Masc, Tense=Pres, VerbForm=Part, HebBinyan=PAAL, Voice=Pass
the ADJ in the following lexicalized beinoni paul, "xashuv she-moshe yagi'a la-pgisha", will get the following: Gender=Masc, Number=Sing
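For reference, here is a minimal sketch of how these three proposed bundles would be written in the FEATS column of a CoNLL-U file, where features are pipe-separated and sorted alphabetically by name. The `to_feats` helper and the `PROPOSED` dict are hypothetical names used for illustration, and the bundles are the ones proposed in this comment, not a settled guideline.

```python
def to_feats(features):
    """Serialize a feature dict as a CoNLL-U FEATS string."""
    if not features:
        return "_"
    # CoNLL-U sorts features alphabetically by name, ignoring case.
    items = sorted(features.items(), key=lambda kv: kv[0].lower())
    return "|".join(f"{name}={value}" for name, value in items)


# The three bundles as proposed in this comment (hypothetical container name).
PROPOSED = {
    "AUX": {"Person": "3", "Number": "Sing", "Gender": "Masc", "VerbType": "Mod"},
    "VERB": {
        "Person": "3", "Number": "Sing", "Gender": "Masc", "Tense": "Pres",
        "VerbForm": "Part", "HebBinyan": "PAAL", "Voice": "Pass",
    },
    "ADJ": {"Gender": "Masc", "Number": "Sing"},
}

for upos, feats in PROPOSED.items():
    print(upos, to_feats(feats))
```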
What @shirawigi showed above is the tendency of certain forms of beinoni paul to appear after an explicit ה, which may be an SCONJ+mark or DET+det, depending on the upos (a tricky decision in and of itself); or after an implicit ה within an ADP, like כ in כנהוג, כאמור, כצפוי, etc. or ב in באמור לעיל.
I see in the current HTB that כאמור is analyzed as one token, with upos ADV, and receives advmod.
This is perhaps expected because of its tendency to appear by itself, as a parenthetical, referring to a previous sentence, paragraph, notion or idea.
But when it comes with complements, such as the following example, how should we analyze it, or indeed, any other such form? גלישה או שימוש בשירותי האתר מהווה הסכמה לאמור בהסכם זה "glisha o shimush be-sherutey ha-atar mehava haskama la-amur be-heskem ze."
@shirawigi :
beinoni paul (such as katuv) is never an ADJ
Not necessarily all, but for katuv it's hard to imagine. It is VERB 6/6 times in HTB.
If it is a verb, can it receive case from the ADP?
No, then it would be advcl, with SCONJ+mark. In this case it also makes sense, since you can insert a "by" phrase (ka-nahug 'al yedey kulam)
Is there a list we can access or compile ourselves of lexicalized adjectives
I don't have one handy, but maybe a search in HTB could help. I think you should mainly test it linguistically - if something is a passive verb form, it should have a relationship to the active. Something "katuv" has been written by someone, and is related to the active ("katuv al yedey", "katvu oto"). The same is not true for "xashuv" (*xashuv al yedey, ??xashvu oto)
The feats you have look mostly right, but I don't think alul is passive at all, and arguably asuy and amur aren't really either, since they do not correspond to actives/can't accept agents, etc. In HTB they do not have voice at all, but anyway they are always tagged AUX... the "rain" example is a rare kind.
Your last example I think is a nominalization (=that which is said), so it should not be a VERB there, but that's an exception.
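The suggested search in HTB (the kind of count behind "VERB 6/6 times") could be sketched as follows. This assumes a standard 10-column CoNLL-U file; `upos_counts` and `row` are hypothetical helpers, and the sample rows are invented for illustration and are not taken from HTB.

```python
from collections import Counter


def upos_counts(conllu_text, form):
    """Count UPOS tags for every token whose FORM column equals `form`."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        tok_id, tok_form, upos = cols[0], cols[1], cols[3]
        if "-" in tok_id or "." in tok_id:
            continue  # skip multiword-token ranges and empty nodes
        if tok_form == form:
            counts[upos] += 1
    return counts


def row(tok_id, form, upos, feats):
    """Build one token line, leaving the other columns unspecified."""
    return "\t".join([tok_id, form, "_", upos, "_", feats, "_", "_", "_", "_"])


# Invented rows for illustration only, not taken from HTB.
SAMPLE = "\n".join([
    "# text = (invented sample)",
    row("1", "כתוב", "VERB", "HebBinyan=PAAL|VerbForm=Part|Voice=Pass"),
    row("2", "חשוב", "ADJ", "Gender=Masc|Number=Sing"),
    row("3", "כתוב", "VERB", "HebBinyan=PAAL|VerbForm=Part|Voice=Pass"),
])

print(upos_counts(SAMPLE, "כתוב"))
```

Run over the real treebank files, a form that shows up almost only as ADJ would be a candidate for the lexicalized list, though the count is just a starting point for the linguistic test described above.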
Your last example I think is a nominalization (=that which is said), so it should not be a VERB there, but that's an exception.
So what is the upos and deprel in that example? I understand it is nominalization, but I'm not sure if you mean it should be ADJ.
In similar instances with ב/לכל הקשור ל (=In all that which is related to), HTB has it as VERB with Voice=Act.
And in some of these, the deprel is dep(kashur, kol), mark(kashur, ha), case(kashur, be); in others, dep(kashur, ha) and det(kashur, kol).
I do not understand completely the distinction between beinoni paul and poel, if in some cases, the paul is not even passive. If, by now, they are no longer considered passives, then it seems that they've become lexicalized adjectives, or on their way to that function.
The agent test or by-phrase doesn't always result in a clear-cut decision. To me, I cannot entirely say "hu kashur al yedey moshe" and be happy with it, but I'm fairly happy with "hu nikshar al yedey moshe ve-axshav hu kashur". xatum/katuv/etc. al-yedey X is really peculiar to me, and I would expect past-tense NIFAL in those instances, with the beinoni paul signifying a present result.
So what is the upos and deprel in that example?
NOUN and nmod; I think the guidelines already say that explicitly about nominalized participles.
For kashur I would actually consider it totally lexicalized, like xashuv, since it does not correspond to "koshrim oto" (at least for me), let alone a by phrase. This feeling is validated by similar forms in other words, for example English "related to" is treated as ADJ with lemma "related":
http://match.grew.fr/?corpus=UD_English-EWT@2.8&custom=6143acaf5f1fe&eud=yes
However "kashur be-xevel" would be VERB and Voice=Pass according to current HTB practices, and I think it is OK (it is not some lexicalized meaning but a totally transparent passive participle of "likshor").
@amir-zeldes
I would like to know what the correct upos and features of beinoni paul are. Should it have the upos ADJ with the features Gender and Number, or the upos VERB with the strange tagging of HebBinyan=PAAL and Voice=Pass?
In some cases it is tagged AUX before an infinitive (צפוי, עשוי, עלול, ראוי, etc.)
I think that if we want to tag it as VERB, then we should add HebBinyan=PAUL, because the combination of HebBinyan=PAAL and Voice=Pass is peculiar.
Consequently, should the tagging be consistent throughout the various constructions or varied based on individual cases? (e.g., בנוי, כרוך, מצוי)
Let's take חשוב, which here seems to be a standard ADJ, declinable for Number and Gender: ההנחיות החשובות סייעו לעובדים בשמירה על בטיחותם.
And compare it to the indeclinable form, whose upos is not clear to me:
הצלחה בתחום הזה תשפיע משמעותית על אימוץ הטכנולוגיה הזו בקרב כל משתמשי הדרך, ולשם כך חשוב להיות גם יצירתיים, גם יסודיים בבחינת הפתרונות בשטח וגם גמישים במתן הפתרונות המתאימים.
If the upos is ADJ, and we have csubj(xashuv, lihyot), is it ok to have VerbType=Mod on an ADJ in this case? If the upos is VERB, technically it should be csubj:pass(xashuv, lihyot), since it comes from HebBinyan=PAAL and Voice=Pass.
A clearer guideline would help in the current batch and in the QA process.
Thanks! Netanel