Open IsraelLand opened 2 years ago
Your analysis of the validator's behavior for מישהו is accurate: if it's a NOUN it can't have those features. I should say that the indefinite substitutives not being pronouns is probably borrowed from English, which ultimately goes back to the PTB tagset's decision to tag them NN. In many other languages (e.g. Slavic), these are all tagged PRON, and if you wanted to do that (and if we update it in HTB), I wouldn't necessarily be against it; on the other hand, it's not a big deal since this is a closed class of items.
As for איפשהו איכשהו, that is not correct: the validator does allow ADV to carry PronType, since in many languages pronominal adverbs are still tagged ADV. I believe the top of the PronType page confirms this:
As for the ADVs, my bad, I misread. I see you mentioned their ability to get subrels as well.
As for the original question - what would you prefer? I think PRON captures מישהו well, with the added benefit of being able to subrel it.
Otherwise, should we aspire to uniformity across this "subset" of words?
For example, משהו is mostly tagged NOUN (as is מישהו), whereas the very similar German (et)was is mostly PRON (where it isn't ADV) in most treebanks, except PUD, where it is NOUN.
The only real reason to keep NOUN is backwards compatibility with HTB, but if we change it in the corrected HTB then I would be OK with PRON.
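To make the practical effect of the NOUN vs. PRON choice concrete, here is a minimal sketch of the kind of check described in this thread: PronType is rejected on NOUN but accepted on PRON and ADV. The allow-list and the function names are hypothetical, loosely based on the PronType guideline description; the actual validator works from per-language feature tables, so treat this as an illustration rather than its real logic.

```python
# Hypothetical allow-list: UPOS tags assumed to accept PronType.
# The real UD validator reads per-language feature tables instead.
PRONTYPE_UPOS = {"PRON", "DET", "NUM", "ADV"}


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Masc|PronType=Ind' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def prontype_allowed(upos, feats):
    """Return True unless the token carries PronType on a UPOS outside the allow-list."""
    return "PronType" not in parse_feats(feats) or upos in PRONTYPE_UPOS


# The three cases discussed in this thread.
for form, upos, feats in [
    ("מישהו", "NOUN", "PronType=Ind"),
    ("מישהו", "PRON", "PronType=Ind"),
    ("איפשהו", "ADV", "PronType=Ind"),
]:
    status = "ok" if prontype_allowed(upos, feats) else "PronType not allowed"
    print(form, upos, status)
```

Under these assumptions only the NOUN row is flagged, which is why retagging מישהו as PRON would let it carry PronType=Ind.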
Thank you!
Hi @amir-zeldes
What POS should we apply for מישהו מטעמו? Obviously it's not the first time dealing with this, but these do come up.
So if we assume it to be NOUN, we cannot tag it PronType=Ind, but we're supposed to according to the guidelines.
We can do either -
The same goes for איפשהו and איכשהו: assuming they are ADV, can they not get their specific subrels?
Thank you