delph-in / matrix

The Grammar Matrix
https://matrix.ling.washington.edu/index.html

semi.vpm #117

Open goodmami opened 4 years ago

goodmami commented 4 years ago

Migrated from Trac:

Automatically create a semi.vpm file for the customized grammars. The basic version of this functionality should look through the choices file for all features and values that can appear on semantic variables. (These would be TENSE, ASPECT, MOOD, PERSON, NUMBER, PERNUM, GENDER, SF, COG-ST, and maybe others.) Determine which features are used in the customized grammar and which values are possible for each feature. Similarly, look in the Other Features section for any user-defined features that appear on semantic variables. Then create a semi.vpm block that maps each specific value to itself and arbitrarily picks one (e.g., the first one defined in the choices file) as the default (e.g., present << [e] under E.TENSE : TENSE).

In general, the values will just be preserved. However, the features that are tucked under E will be mapped out to features without E in the path, and PERNUM should be mapped to separate PER and NUM dimensions.
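A minimal sketch of the basic version, assuming the features and values have already been pulled out of the choices file. The function name `build_vpm_block` and its parameters are hypothetical, not existing customization-system code. The block layout follows the E.TENSE : TENSE example above, and PERNUM splitting is left out:

```python
def build_vpm_block(internal_path, values, var_type="e"):
    """Return the text of one semi.vpm property block.

    internal_path -- grammar-internal feature path, e.g. "E.TENSE"
    values        -- values found in the choices file, in definition order
    var_type      -- variable type used in the default rule ("e" or "x")

    Hypothetical helper for illustration; not part of the Matrix code.
    """
    # The external feature drops the internal path prefix: E.TENSE -> TENSE.
    external = internal_path.split(".")[-1]
    lines = [f"{internal_path} : {external}"]
    # Each specific value maps to itself in both directions.
    for value in values:
        lines.append(f"  {value} <> {value}")
    # Arbitrarily use the first value as the default for this variable type.
    if values:
        lines.append(f"  {values[0]} << [{var_type}]")
    return "\n".join(lines)


print(build_vpm_block("E.TENSE", ["present", "past"]))
```

For the call shown, this prints a header line `E.TENSE : TENSE`, one identity rule per value, and the default rule `present << [e]`.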

Complications:

A more advanced version (once the basic version is working) could take user input to do the following:

goodmami commented 4 years ago

This seems mostly done already. We currently customize a semi.vpm with some things from the choices files. Is there anything actionable left here?

emilymbender commented 4 years ago

I think the automatically created semi.vpm files are still somewhat broken. Among other things, they map out to things like E.TENSE instead of just TENSE and could likely have better handling of defaults. You can see what I tell Ling 567 students to change about them under "Variable Property Mapping" here: http://courses.washington.edu/ling567/lab8.html (NB: that link will break in Jan 2021, but be archived under ling567/2020/lab8.html). I'm not 100% confident that what I guide them to is really best practice though.

goodmami commented 4 years ago

Ok, it sounds like we're mainly just mapping to the same structures and values when we should be mapping to the "external" property scheme. For instance, here's how the code customizes the person properties:

https://github.com/delph-in/matrix/blob/686f07203b4a8f7efc3daa4c836d0b625cdcb948/gmcs/linglib/agreement_features.py#L145-L159

At the feature level the fix seems pretty straightforward. More challenging is mapping to the proper values. I don't know if there is any published set of valid ones. Also, creating some no-aspect (or similar) values might be a challenge, as we would need to choose an available name if one doesn't exist already.
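One possible way to handle the naming problem, as a rough sketch only: reuse a no-FEATURE value if the grammar already defines one for that feature, and otherwise pick a name that doesn't collide with any existing type. The function `choose_default_value`, the `no-<feature>` pattern, and the numeric suffix are all assumptions for illustration, not an established convention:

```python
def choose_default_value(feature, feature_values, taken_names):
    """Pick a name for the 'unspecified' value of a feature.

    feature        -- external feature name, e.g. "ASPECT"
    feature_values -- values already defined for this feature
    taken_names    -- all type names already used in the grammar

    Hypothetical helper; the naming scheme is an assumption.
    """
    base = f"no-{feature.lower()}"
    # Reuse an existing value if the grammar already has one.
    if base in feature_values:
        return base
    # Otherwise find a name that does not clash with any existing type.
    candidate = base
    suffix = 2
    while candidate in taken_names:
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate


print(choose_default_value("ASPECT", ["perfective", "imperfective"],
                           {"perfective", "imperfective"}))
```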