Open somelinguist opened 2 years ago
I like these ideas. It seems like we need "namespaces" for features, and this would be a way of providing it. This is nicer than requiring that the features for Sbj and Obj have different names.
Apertium has discussed reserving `:` in tags for other uses, but `.` should be fine.
Correction: There are two different senses of "should be fine": `<sbj.num>` will be processed in the input stream without issue, but if someone writes `tags="sbj.num"`, that will be processed as referring to `<sbj><num>`. Writing `tags="sbj\.num"` would probably fix that, but it's probably better to find a different separator. Maybe `sbj|num` could work?
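The collision can be sketched with a toy model. This is not Apertium's actual parser; `expand_tags` is a hypothetical helper that only assumes, as described above, that a dot inside a `tags` attribute separates consecutive tags:

```python
def expand_tags(tags_attr: str) -> str:
    """Toy model: treat every dot-separated piece as its own tag."""
    return "".join(f"<{part}>" for part in tags_attr.split("."))

# A dotted feature path is split into two tags...
print(expand_tags("sbj.num"))  # <sbj><num>
# ...while a separator the model does not treat specially stays whole.
print(expand_tags("sbj|num"))  # <sbj|num>
```

Whether `|` is actually safe everywhere else in the pipeline would still need to be checked.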
Currently, it's only possible to reference feature values in transfer rules, and this is done by defining attributes with corresponding values.
Because only values are output, feature values need to be unique to work correctly, at least within a rule.
This makes it hard to work with multiple complex features, like subject and object agreement, whose sub-features draw on the same list of possible values, like person and number.
For example, an irregularly inflecting verb that means "say" might have features like `[sbj:[num:sg][pers:3]][obj:[num:sg][pers:2]]` in FLEx. With the right rule, a form with such features could be output as something like `hit1.1 sg sg 3 2` in FLExTrans, which makes it impossible(?) to synthesize correctly.

If there were a way to refer to the feature name/path in addition to the value, it seems like it would be possible to write a rule that would correctly synthesize/match.
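To make the ambiguity concrete, here is a minimal sketch in plain Python. It does not use FLEx or FLExTrans code; the dictionary layout and the `flatten_values` helper are assumptions made for illustration, and the exact tag order FLExTrans emits is not modelled.

```python
# The nested features from the example, as a plain dictionary (assumed layout).
features = {"sbj": {"num": "sg", "pers": "3"}, "obj": {"num": "sg", "pers": "2"}}

# The same verb with subject and object person swapped.
swapped = {"sbj": {"num": "sg", "pers": "2"}, "obj": {"num": "sg", "pers": "3"}}


def flatten_values(feats):
    """Emit only the leaf values, dropping every feature name (hypothetical helper)."""
    out = []
    for value in feats.values():
        if isinstance(value, dict):
            out.extend(flatten_values(value))
        else:
            out.append(value)
    return out


tags = flatten_values(features)
print(tags)  # ['sg', '3', 'sg', '2']

# Nothing in the output says which 'sg' belongs to sbj and which to obj,
# and as an unordered set of tags the two forms are indistinguishable.
print(sorted(tags) == sorted(flatten_values(swapped)))  # True
```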
Some ideas:
- `name:value`
- `sbj.num` (a dot `.` probably isn't the best path separator if Apertium has problems with that).

These are just some ideas, which might be hard to implement or not worth implementing, especially if they caused incompatibilities with previous versions.
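As a rough sketch of what path-qualified output could look like, the following Python combines the two ideas. Everything here is hypothetical: `flatten_paths`, its parameters, and the resulting tag shapes are illustrations, not an existing FLExTrans or Apertium feature, and both separators are left configurable because the right choice is still open.

```python
def flatten_paths(feats, path_sep=".", kv_sep=":", prefix=""):
    """Emit one 'path<kv_sep>value' string per leaf feature (hypothetical helper)."""
    out = []
    for name, value in feats.items():
        # Build the full path to this feature, e.g. 'sbj' then 'sbj.num'.
        path = f"{prefix}{path_sep}{name}" if prefix else name
        if isinstance(value, dict):
            out.extend(flatten_paths(value, path_sep, kv_sep, path))
        else:
            out.append(f"{path}{kv_sep}{value}")
    return out


verb = {"sbj": {"num": "sg", "pers": "3"}, "obj": {"num": "sg", "pers": "2"}}

print(flatten_paths(verb))
# ['sbj.num:sg', 'sbj.pers:3', 'obj.num:sg', 'obj.pers:2']

print(flatten_paths(verb, path_sep="|", kv_sep="="))
# ['sbj|num=sg', 'sbj|pers=3', 'obj|num=sg', 'obj|pers=2']
```

With the path kept in each tag, identical values under different features no longer collide, so a rule could match on the subject number alone.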