Open kylebgorman opened 1 month ago
What about animacy for nouns and aspect for verbs?
Animacy for nouns is exactly the same issue as gender for nouns. (There are of course a few nouns which have virile and inanimate versions; słoik comes to mind. But I wouldn't say nouns "decline for animacy", just that if you derive an animate from an inanimate noun, you inflect it slightly differently.)
I know a bit less about Polish verbs, but I assume the system is like other Slavic languages. Most people would say, of Russian verbs for instance, that aspect is inherent to a verb: while there are sometimes perfective and imperfective variants of the same verb root, it is often hard to predict what prefix, suffix, or stem change will be used to generate the perfective, and there are many imperfectives without corresponding perfectives, or vice versa.
The way these features are written is basically fine; they just belong in a separate column.
OK, I moved gender, animacy (nouns) and aspect (verbs) to the 4th column and also switched to xz compression. Please, let me know if this is OK, as I need to do the same for Czech, Slovak and Ukrainian data.
LGTM all around.
I have one last thing for you while I have your attention; re: #2 there are an awful lot of feminines missing a gen.pl.; not sure if that's intentional or not.
Can you give any examples? I looked for feminine nouns not containing GEN;PL form and I got only "krzta", which is defective.
This is an error on my part: my routine was expecting a unique gen.pl. and these words I'm seeing all have multiple gen.pl.s (e.g.: acerola).
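For reference, a check along these lines can be written so that it tolerates multiple gen.pl. forms per lemma. This is only a rough sketch, not the routine either of us actually ran: it assumes rows of (lemma, form, features, inherent features) with semicolon-separated tags and `FEM` in the fourth column, and the function name and the placeholder forms in the sample rows are made up for illustration.

```python
from collections import defaultdict


def feminines_missing_gen_pl(rows):
    """Return feminine lemmas that have no GEN;PL form at all.

    rows: iterable of (lemma, form, features, inherent) tuples, where
    features and inherent are semicolon-separated tag strings (assumed layout).
    """
    gen_pl_count = defaultdict(int)  # lemma -> number of GEN;PL forms seen
    for lemma, form, features, inherent in rows:
        if "FEM" not in inherent.split(";"):
            continue  # only feminine lexemes are of interest here
        tags = set(features.split(";"))
        # Count every GEN;PL form, so lemmas with several variants are fine.
        gen_pl_count[lemma] += int({"GEN", "PL"} <= tags)
    return sorted(lemma for lemma, n in gen_pl_count.items() if n == 0)


# Placeholder forms (form_1, form_2, ...) stand in for real inflected forms.
rows = [
    ("acerola", "form_1", "N;GEN;PL", "FEM"),
    ("acerola", "form_2", "N;GEN;PL", "FEM"),
    ("krzta", "form_3", "N;GEN;SG", "FEM"),
    ("słoik", "form_4", "N;GEN;PL", "MASC;INAN"),
]
print(feminines_missing_gen_pl(rows))
```

Counting forms rather than expecting exactly one means a lemma like acerola, with two gen.pl. variants, is not reported, while a defective noun like krzta still is.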
pol.zip, data from the online grammatical dictionary, includes in the 3rd column information about noun genders. This is not to spec. Polish nouns don't "decline for gender"; rather it is an inherent feature of the lexeme, and thus should not be present. During the discussion for the UniMorph 3 spec, I proposed that we include inherent lexical features in the fourth column, but I don't think this was ever put into action. I propose this just be placed in the fourth column (and removed from the 3rd). @wkieras
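To make the proposal concrete, here is a minimal sketch of the column move. It is not the conversion script used for this data: it assumes tab-separated lines of lemma, form, and features, semicolon-separated tags with the part of speech first, and the tag names below, which may not match the labels in pol.zip. The function name `split_inherent` is hypothetical.

```python
# Assumed tag inventory; the labels in the actual data may differ.
INHERENT_BY_POS = {
    "N": {"MASC", "FEM", "NEUT", "ANIM", "INAN", "HUM"},  # gender, animacy
    "V": {"PFV", "IPFV"},                                  # aspect
}


def split_inherent(line):
    """Move inherent lexical features from the 3rd column to a 4th column."""
    lemma, form, features = line.rstrip("\n").split("\t")[:3]
    tags = features.split(";")
    # The part of speech is assumed to be the first tag in the bundle.
    inherent_set = INHERENT_BY_POS.get(tags[0], set())
    inflectional = [t for t in tags if t not in inherent_set]
    inherent = [t for t in tags if t in inherent_set]
    return "\t".join([lemma, form, ";".join(inflectional), ";".join(inherent)])


print(split_inherent("słoik\tsłoika\tN;MASC;INAN;GEN;SG"))
```

Keying the inherent set on part of speech matters: adjectives and past-tense verbs really do inflect for gender, so their gender tags stay in the 3rd column and their 4th column is left empty.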