Open TDBuck opened 8 years ago
@gcelano @nevenjovanovic I made the proposed changes to the testlat tagset. As soon as the pull request is approved you should be able to look at the proposed changes to the gerund in that tagset.
@gcelano @nevenjovanovic this change to the testlat tagset is deployed now -- change the format of a treebank file to testlat to test it.
@TDBuck I think this report by @jmharrington references at least some of what you have fixed here, but additional changes may be required?
Gerund should have:
CASE pulldown (cannot be nominative)
GENDER is NEUTER
NUMBER is SINGULAR
TENSE is PRESENT
VOICE is ACTIVE
should have no options for PERSON
Supine should have:
CASE pulldown (can only be accusative, ablative, or very rarely dative ending in -UI)
GENDER is MASCULINE
NUMBER is SINGULAR
TENSE is PRESENT
VOICE is ACTIVE
should have no options for PERSON
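The constraints listed above could be written down as data and checked mechanically. The following is only a minimal sketch in Python: the names `CASES`, `CONSTRAINTS`, and `validate` are hypothetical, the six-case inventory is an assumption, and none of this reflects the actual format of the testlat tagset or Arethusa's configuration.

```python
# Hypothetical sketch: these names and this layout are illustrative only,
# not the real testlat tagset format.

# Assumed case inventory (six cases, locative left out).
CASES = {"nominative", "genitive", "dative", "accusative", "ablative", "vocative"}

# Allowed values per feature, following the lists above.
CONSTRAINTS = {
    "gerund": {
        "case": CASES - {"nominative"},  # cannot be nominative
        "gender": {"neuter"},
        "number": {"singular"},
        "tense": {"present"},
        "voice": {"active"},
    },
    "supine": {
        "case": {"accusative", "ablative", "dative"},  # dative in -ui is very rare
        "gender": {"masculine"},
        "number": {"singular"},
        "tense": {"present"},
        "voice": {"active"},
    },
}


def validate(form, features):
    """Return a list of problems; an empty list means the features fit."""
    problems = []
    # Neither form should offer PERSON at all.
    if "person" in features:
        problems.append("person is not allowed")
    for name, allowed in CONSTRAINTS[form].items():
        value = features.get(name)
        if value is None:
            problems.append(f"{name} is missing")
        elif value not in allowed:
            problems.append(f"{name}={value} is not allowed")
    return problems


if __name__ == "__main__":
    gerund = {"case": "nominative", "gender": "neuter", "number": "singular",
              "tense": "present", "voice": "active"}
    print(validate("gerund", gerund))
```

Keeping the fixed values (gender, number, tense, voice) explicit rather than inferred matches the more declarative style, at the cost of a little redundancy.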
@gcelano @nevenjovanovic can you take a look at Matthew's suggestions above and confirm that you agree? As it currently stands, the testlat tagset seems wrong.
Matthew would like to: -reinstate case
The gerund has case and voice (please note that no one asked to delete them in the previous issue). As for the other features Matthew suggests, they are correct. Actually, gender, number, and tense can always be inferred, so they could in principle be omitted, but a more declarative style is, all in all, better. All of these problems are going to be discussed again in the near future in the context of the UD annotation scheme, which we should try to follow as far as possible.
Dear All, I agree with @gcelano that we need all six cases for the Latin gerund (let's keep the nominative and the vocative, just in case). I also strongly agree with @TDBuck on the need to have a Latin morphology 'school set', as streamlined as possible; we need it for pedagogical reasons (not to confuse people), for efficiency (to minimize the possibility of errors which would then propagate to Arethusa), and for tactical purposes (to avoid scandalizing the classics professors when they use Arethusa). Preparing such a pedagogical school set would be a good Teach the Teachers task. I don't know what the UD acronym stands for; if @gcelano would explain, this would be most welcome.
Universal Dependencies
@TDBuck I'm a bit lost on where we are with this. Can you review the comments and update the test tagset accordingly? Thanks!
I'm a little lost here too. @balmas, correct me if I'm wrong, but my understanding is that you can't have one tag with two different sets of values. So you could not make "gender" have a different set of values for nouns and gerunds. What you could do is create two tags called "gender", one for gerunds and one for everything else. But that seems inefficient, and it would likely skew some queries.
If I'm wrong let me know, and I'll make some changes accordingly.
As @nevenjovanovic mentioned, having tiered Latin tagsets is also an option, and one that I like. If you want me to get to work making those, I can do that as well.
Moving this issue over from https://github.com/alpheios-project/arethusa/issues/768#
Will make changes to tagset.