In https://github.com/Project-OSRM/osrm-text-instructions/pull/283#pullrequestreview-221816101, @xendez and @jppcel contributed new translations for Portuguese and pointed out that certain names would need a different preposition-article contraction depending on the grammatical gender of the name. This PR adapts the French grammatical system that @yuryleb implemented in #252.
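To make the contraction problem concrete, here is a minimal Python sketch. It is not the implementation in this PR. The gender table is a tiny hand-picked sample, the function name `with_preposition` is hypothetical, and falling back to the bare preposition when the gender is unknown is only an assumption (the right fallback is one of the open questions below).

```python
# Illustrative only: a tiny hand-picked gender table, not this PR's rule list.
GENDER_BY_DESIGNATION = {
    "rua": "f",
    "avenida": "f",
    "praça": "f",
    "travessa": "f",
    "estrada": "f",
    "largo": "m",
    "beco": "m",
    "viaduto": "m",
    "túnel": "m",
}

# Standard singular contractions of "em" and "a" with the definite article.
CONTRACTIONS = {
    ("em", "m"): "no",
    ("em", "f"): "na",
    ("a", "m"): "ao",
    ("a", "f"): "à",
}


def with_preposition(preposition, name):
    """Prefix `name` with the contraction matching its first word's gender."""
    words = name.split()
    first_word = words[0].lower() if words else ""
    gender = GENDER_BY_DESIGNATION.get(first_word)
    # Assumed fallback: keep the bare preposition when the gender is unknown.
    lead = CONTRACTIONS.get((preposition, gender), preposition)
    return f"{lead} {name}"


print(with_preposition("em", "Rua Augusta"))
print(with_preposition("em", "Largo do Carmo"))
```

In this sketch, a name whose first word is not in the table (a bare place name, for instance) keeps the uncontracted preposition.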
I came up with the list of rules and tests by querying OpenStreetMap for name tags on highway ways in Lisbon and Rio de Janeiro. (I’m not sure if those two cities are representative of road type designations elsewhere in Lusophone countries, but we can always manually add more road type designations.) I isolated the road type designations by stripping all but the first word of each multiword name, then removing duplicates, given names, and acronyms. Finally, I looked up the grammatical gender of each word in the English and Portuguese Wiktionaries and identified a limited set of lexical patterns. (Hopefully my limited Spanish didn’t bias the final rules in any way.)
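The isolation step can be sketched roughly as follows, assuming the name tags have already been exported to a list of strings. The function name `candidate_designations` and the sample names are hypothetical, and treating any all-uppercase first word as an acronym is a simplifying assumption. Given names are not filtered here; weeding those out is left to manual review.

```python
def candidate_designations(names):
    """Return the distinct first words of multiword names, sorted."""
    first_words = set()
    for name in names:
        words = name.split()
        if len(words) < 2:
            continue  # single-word names carry no separate designation
        first = words[0]
        if first.isupper():
            continue  # assumption: an all-uppercase first word is an acronym
        first_words.add(first)
    return sorted(first_words)


# Made-up sample input; the real input would be the exported name tags.
sample = [
    "Rua Augusta",
    "Rua do Ouro",
    "Avenida Atlântica",
    "BR 101",
    "Largo do Carmo",
    "Chiado",
]
print(candidate_designations(sample))
```

Each surviving word would then be looked up for grammatical gender before being turned into a rule.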
Because I started with road names, these rules probably won’t work as well on the place names that appear in the destination and waypoint_name variables. But again, it’s just a start.
I’ve also added a starter list of abbreviations, based on some common ones I found in street names in Lisbon and Rio de Janeiro.
Before merging, it’d be great to get feedback on the following:
Does it make sense to apply these rules to all instances of “em” and “a” followed by destination, junction_name, way_name, or waypoint_name?
What prepositions should we fall back on when we’re unable to determine the grammatical gender lexically? (There seems to be a lot of ambiguity in Portuguese due to etymology that isn’t obvious from the spelling.)
Are there any cases where a particular road type designation shouldn’t be preceded by an article?
Are there any adjectives that can be either gender and commonly precede the road type designation (akin to “gran” in Spanish)?
Requirements / Relations
Depends on #283.
/cc @danpaz