alpheios-project / arethusa-configs

Additional configuration files for Arethusa

Latin gerund does not need Person, Number, Tense, Mood #51

Open TDBuck opened 8 years ago

TDBuck commented 8 years ago

Moving this issue over from https://github.com/alpheios-project/arethusa/issues/768#

Will make changes to tagset.

TDBuck commented 8 years ago

@gcelano @nevenjovanovic I made the proposed changes to the testlat tagset. As soon as the pull request is approved you should be able to look at the proposed changes to the gerund in that tagset.

balmas commented 8 years ago

@gcelano @nevenjovanovic this change to the testlat tagset is deployed now -- change the format of a treebank file to testlat to test it.

balmas commented 8 years ago

@TDBuck I think this report by @jmharrington references at least some of what you have fixed here, but additional changes may be required.

Gerund should have:
- CASE pulldown (cannot be nominative)
- GENDER is NEUTER
- NUMBER is SINGULAR
- TENSE is PRESENT
- VOICE is ACTIVE
- should have no options for PERSON

Supine should have:
- CASE pulldown (can only be accusative, ablative, or very rarely dative ending in -UI)
- GENDER is MASCULINE
- NUMBER is SINGULAR
- TENSE is PRESENT
- VOICE is ACTIVE
- should have no options for PERSON
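The constraints in this report could be written down as a small lookup table and checked mechanically. The following is only a sketch under stated assumptions: `ALL_CASES`, `CONSTRAINTS`, and `check` are hypothetical names, the feature spellings are illustrative, and nothing here reflects Arethusa's actual tagset file format.

```python
# Hypothetical constraint table based on the report above.
# This is NOT Arethusa's tagset format; all names are illustrative.
ALL_CASES = {"nominative", "genitive", "dative", "accusative", "ablative", "vocative"}

CONSTRAINTS = {
    "gerund": {
        "case": ALL_CASES - {"nominative"},  # report: cannot be nominative
        "gender": {"neuter"},
        "number": {"singular"},
        "tense": {"present"},
        "voice": {"active"},
    },
    "supine": {
        "case": {"accusative", "ablative", "dative"},  # dative only very rarely
        "gender": {"masculine"},
        "number": {"singular"},
        "tense": {"present"},
        "voice": {"active"},
    },
}


def check(form, features):
    """Return a list of problems with `features` for the given verb form."""
    allowed = CONSTRAINTS[form]
    problems = []
    for name, value in features.items():
        if name not in allowed:
            # e.g. PERSON, which should offer no options at all
            problems.append(f"{name} is not allowed for {form}")
        elif value not in allowed[name]:
            problems.append(f"{name}={value} is not allowed for {form}")
    return problems
```

For example, `check("gerund", {"person": "third"})` reports that `person` is not allowed, while `check("supine", {"case": "accusative", "gender": "masculine"})` returns an empty list.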

balmas commented 8 years ago

@gcelano @nevenjovanovic can you take a look at Matthew's suggestions above and confirm that you agree? The testlat tagset as it currently is seems wrong.

Matthew would like to:
- reinstate case

gcelano commented 8 years ago

The gerund has case and voice (please note that no one asked to delete them in the previous issue). As for the other features Matthew suggests, they are correct. Actually, gender, number, and tense can always be inferred, so they could in principle be omitted, but a more declarative style is, all in all, better. All of these problems are going to be discussed again in the near future in the context of the UD annotation scheme, which we should try to follow as far as possible.
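To make the point about inferable features concrete, here is a minimal sketch of how omitted values could be filled in when a fully declarative record is wanted. The names `IMPLIED` and `expand` are hypothetical, and the implied values are the ones proposed for the gerund earlier in this thread; this is not how Arethusa itself stores or expands tags.

```python
# Hypothetical table of values that follow from the verb form alone.
# Values are those proposed for the gerund in this thread.
IMPLIED = {
    "gerund": {"gender": "neuter", "number": "singular", "tense": "present"},
}


def expand(form, features):
    """Return a copy of `features` with the implied values filled in."""
    full = dict(IMPLIED.get(form, {}))  # start from what the form implies
    full.update(features)               # explicit annotations take precedence
    return full
```

With this, an annotator would only need to supply case and voice for a gerund, and `expand("gerund", {"case": "genitive", "voice": "active"})` would yield the full five-feature record. Storing all values explicitly, as preferred above, makes this step unnecessary.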

nevenjovanovic commented 8 years ago

Dear All, I agree with @gcelano that we need all six cases for the Latin gerund (let's keep the nominative and the vocative, just in case). I also strongly agree with @TDBuck on the need to have a Latin morphology 'school set', as streamlined as possible; we need it for pedagogical reasons (not to confuse people), for efficiency (to minimize the possibility of errors which would then propagate to Arethusa), and for tactical purposes (to avoid scandalizing the classics professors when they use Arethusa). Preparing such a pedagogical school set would be a good Teach the Teachers task. I don't know what the UD acronym stands for; if @gcelano would explain, this would be most welcome.

gcelano commented 8 years ago

Universal Dependencies

balmas commented 8 years ago

@TDBuck I'm a bit lost on where we are with this. Can you review the comments and update the test tagset accordingly? Thanks!

TDBuck commented 8 years ago

I'm a little lost here too. @balmas, correct me if I'm wrong, but my understanding is that you can't have one tag with two different sets of values. So you could not make "gender" have a different set of values for nouns and gerunds. What you could do is create two tags called "gender", one for gerunds and one for everything else. But that seems inefficient, and it seems like it would skew some queries.

If I'm wrong let me know, and I'll make some changes accordingly.
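The query concern above can be illustrated with a small sketch. It assumes, purely for illustration, that the two gender tags end up stored under different keys (`gender` and `gerund_gender`); the token records and function names are hypothetical and do not reflect Arethusa's data model.

```python
# Hypothetical tokens annotated with a split gender attribute.
tokens = [
    {"form": "bellum", "pos": "noun", "gender": "neuter"},
    {"form": "amandi", "pos": "gerund", "gerund_gender": "neuter"},
    {"form": "puella", "pos": "noun", "gender": "feminine"},
]


def neuter_naive(tokens):
    """Query only the ordinary gender tag; gerunds are silently missed."""
    return [t["form"] for t in tokens if t.get("gender") == "neuter"]


def neuter_both(tokens):
    """Query both tags, which every caller would have to remember to do."""
    return [
        t["form"]
        for t in tokens
        if "neuter" in (t.get("gender"), t.get("gerund_gender"))
    ]
```

Here `neuter_naive(tokens)` returns `["bellum"]`, while `neuter_both(tokens)` returns `["bellum", "amandi"]`. Any query written against only one of the two tags would undercount in this way.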

As @nevenjovanovic mentioned, having tiered Latin tagsets is also an option, and one that I like. If you want me to get to work making those, I can do that as well.