Closed Stormur closed 1 year ago
I wonder whether each of the three issues should be a separate GitHub issue?
Regarding the first one, here is the current usage in treebanks:

- `xcomp:pred` is used in Irish, Scottish Gaelic, Manx, North Sami, Latin, and Polish (at least in some treebanks of those languages)
- `xcomp:sp` is used in Ukrainian
- `advcl:pred` is used in Latin
- `advcl:sp` is used in Ukrainian
- `nmod:pred` is used in Polish for a similar purpose (with nominalized copulas)
- `ccomp:pred` in Hungarian is probably also related

Probably unrelated:

- `case:pred` in Welsh is used for predicative particles
- `discourse:sp` is used in Classical Chinese, Chinese and Cantonese, where it stands for "sentence particle"

> I wonder whether each of the three issues should be a separate GitHub issue?
They are in fact more or less uncorrelated, but I felt like compacting them so as not to clog the issues... but if we agree, we can split them, and I will present them separately in future, too!
My point was that the discussion for each of them may diverge in the future, and it could become a mess if interleaved in one thread. Although there is obviously the common goal of annotating the same thing the same way across treebanks.
We used `:sp` more as a placeholder, in anticipation that the community would eventually come up with a proper name. No problem renaming to `:pred`.
@Stormur, done, renamed to `:pred`.
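For anyone applying the same rename to their own data, here is a minimal sketch of how it could be done on CoNLL-U lines. It is not the script used for the Ukrainian treebanks; the function name `rename_subtype` and its parameters are hypothetical. It assumes the standard 10-column tab-separated layout with DEPREL in the eighth column, and it only touches `xcomp` and `advcl`, so that unrelated labels such as `discourse:sp` stay as they are. The enhanced DEPS column is deliberately left alone here.

```python
def rename_subtype(line, old="sp", new="pred", relations=("xcomp", "advcl")):
    """Rename a relation subtype in the DEPREL column of one CoNLL-U line.

    Hypothetical helper: the line is returned without a trailing newline.
    """
    line = line.rstrip("\n")
    # Comment lines and sentence-separating blank lines pass through unchanged.
    if not line.strip() or line.startswith("#"):
        return line
    cols = line.split("\t")
    if len(cols) != 10:
        # Not a regular token line; leave it alone rather than guess.
        return line
    rel, _, subtype = cols[7].partition(":")
    # Only rename on the listed base relations (assumption: xcomp and advcl).
    if rel in relations and subtype == old:
        cols[7] = f"{rel}:{new}"
    return "\t".join(cols)


if __name__ == "__main__":
    row = "3\tword\tlemma\tADJ\t_\t_\t2\txcomp:sp\t_\t_"
    print(rename_subtype(row))
```

Restricting the rename to a list of base relations is the main design choice: a blanket replacement of `:sp` would also rewrite labels where `sp` means something else.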
I have recently noticed from one of the recent discussions that Ukrainian treebanks make use of the subtype `sp` with `advcl` or `xcomp` for so-called secondary predications. Now, this subtype happens to be a doppelgänger of what we are using for Latin: `pred`. So the question is: can we converge on one common tag (that will hopefully be used by other treebanks, too)? `xcomp:pred` seems to be used in a very similar sense by other treebanks, too. Some considerations about the possible tag: `pred`; `advcl` or `xcomp`. So what should we choose? Maybe something like `secpred`, in the spirit of `relcl`? Should we maybe converge on `pred` or on `sp`? `pred` is slightly more used currently, but I am not sure if it always has the same meaning.

For `AdvType`, we have `Loc` (of location) and `Tim` (of time), but there is also a `Temp`. Probably we should just converge on `Tim`, keeping the overall triliteral symmetry.

For comparative constructions, there are both relation subtypes and feature values oscillating between `[Cc]mp`, `comp`, `cmpr`, which are variously distributed. We should probably choose only one for coherence. Maybe the best choice could be `cmp`/`Cmp`, in "deference" to the "original" value for adjectival degrees.
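To make the second and third proposals concrete, here is a rough sketch of what the harmonisation could look like when applied to the FEATS and DEPREL fields of a CoNLL-U token. The two mapping tables only mirror the suggestions in this thread (`Temp` to `Tim`, and `comp`/`cmpr` to `cmp`); they are assumptions for illustration, not an agreed standard, and the function names are hypothetical. Whether every `comp` or `cmpr` subtype in existing treebanks really means "comparative" would have to be checked before running anything like this.

```python
# Illustrative mappings taken from the proposal above; not an agreed standard.
FEATURE_VALUE_MAP = {("AdvType", "Temp"): "Tim"}
SUBTYPE_MAP = {"comp": "cmp", "cmpr": "cmp"}


def harmonize_feats(feats):
    """Rewrite feature values in a CoNLL-U FEATS string such as 'A=x|B=y,z'."""
    if feats == "_":
        return feats
    out = []
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        # A feature may carry several comma-separated values; map each one,
        # drop duplicates created by the mapping, and keep them sorted.
        values = sorted({FEATURE_VALUE_MAP.get((name, v), v) for v in value.split(",")})
        out.append(f"{name}={','.join(values)}")
    return "|".join(out)


def harmonize_deprel(deprel):
    """Rewrite only the subtype (the part after the colon) of a relation label."""
    rel, sep, subtype = deprel.partition(":")
    return rel + sep + SUBTYPE_MAP.get(subtype, subtype)


if __name__ == "__main__":
    print(harmonize_feats("AdvType=Temp|Degree=Cmp"))
    print(harmonize_deprel("advcl:cmpr"))
```

Because only the subtype is looked up, base relations such as `xcomp` or `ccomp` are never touched, even though they contain the string `comp`.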