Project-OSRM / osrm-text-instructions

Text instructions from OSRM route responses
BSD 2-Clause "Simplified" License

Automatically choose preposition-article contraction in Portuguese #294

Open 1ec5 opened 4 years ago

1ec5 commented 4 years ago

Issue

In https://github.com/Project-OSRM/osrm-text-instructions/pull/283#pullrequestreview-221816101, @xendez and @jppcel contributed new translations for Portuguese and pointed out that certain names would need a different preposition-article contraction depending on the grammatical gender of the name. This PR adapts the French grammatical system that @yuryleb implemented in #252.
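To make the contraction problem concrete, here is a minimal Python sketch. It is not the grammar format used in this repository. The `ROAD_TYPE_GENDER` table and the `contract` helper are hypothetical names made up for illustration, and the gender table holds only a few example road types. The contractions themselves (de + o = do, em + a = na, and so on) are standard Portuguese.

```python
# Hypothetical lookup keyed by the first word of a road name (illustrative only;
# the actual rule list in the PR is larger and expressed differently).
ROAD_TYPE_GENDER = {
    "rua": "f",
    "avenida": "f",
    "praça": "f",
    "travessa": "f",
    "estrada": "f",
    "largo": "m",
    "beco": "m",
    "caminho": "m",
}

# Standard contractions of a preposition with the singular definite article.
CONTRACTIONS = {
    ("de", "m"): "do",
    ("de", "f"): "da",
    ("em", "m"): "no",
    ("em", "f"): "na",
    ("a", "m"): "ao",
    ("a", "f"): "à",
    ("por", "m"): "pelo",
    ("por", "f"): "pela",
}


def contract(preposition, name):
    """Prefix `name` with the contracted preposition when its gender is known.

    Falls back to the bare preposition when the first word is not a known
    road type or the preposition has no entry in CONTRACTIONS.
    """
    words = name.split()
    first = words[0].lower() if words else ""
    gender = ROAD_TYPE_GENDER.get(first)
    contracted = CONTRACTIONS.get((preposition, gender))
    if contracted is None:
        return f"{preposition} {name}"
    return f"{contracted} {name}"
```

With these tables, `contract("em", "Rua Augusta")` yields "na Rua Augusta" and `contract("em", "Largo do Carmo")` yields "no Largo do Carmo". A name whose first word is not in the table is left with the uncontracted preposition.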

I came up with the list of rules and tests by querying OpenStreetMap for name tags on highway ways in Lisbon and Rio de Janeiro. (I’m not sure if those two cities are representative of road type designations elsewhere in Lusophone countries, but we can always manually add more road type designations.) I isolated the road type designations by stripping all but the first word of each multiword name, then removing duplicates, given names, and acronyms. Finally, I looked up the grammatical gender of each word in the English and Portuguese Wiktionaries and identified a limited set of lexical patterns. (Hopefully my limited Spanish didn’t bias the final rules in any way.)
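The isolation step described above could be sketched roughly as follows in Python. This is an illustration, not the script actually used. The function name `road_type_candidates` is hypothetical, the all-uppercase check is an assumed heuristic for spotting acronyms, and given names are filtered through a caller-supplied exclusion set because there is no reliable automatic way to detect them.

```python
def road_type_candidates(names, exclude=frozenset()):
    """Return candidate road type designations from a list of road names.

    Keeps the lowercased first word of each multiword name, in first-seen
    order, skipping duplicates, acronyms, and anything listed in `exclude`.
    """
    seen = set()
    candidates = []
    for name in names:
        words = name.split()
        if len(words) < 2:
            # Only multiword names carry a separate road type designation.
            continue
        first = words[0]
        if len(first) > 1 and first.isupper():
            # Assumed heuristic: an all-uppercase first word is an acronym.
            continue
        key = first.lower()
        if key in exclude or key in seen:
            continue
        seen.add(key)
        candidates.append(key)
    return candidates
```

For example, given the names "Rua Augusta", "Rua do Ouro", "Avenida Atlântica", "BR 101", "Maria Pia", "Largo do Carmo", and "Alfama", with "maria" in the exclusion set, the function returns `["rua", "avenida", "largo"]`. The gender of each remaining word would still have to be looked up by hand.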

Because I started with road names, these rules are unlikely to work well on the place names that appear in the destination and waypoint_name variables. But again, it’s just a start.

I also added a starter list of abbreviations, based on common abbreviations I found in street names in Lisbon and Rio de Janeiro.

Before merging, it’d be great to get feedback on the following:

Tasklist

Requirements / Relations

Depends on #283.

/cc @danpaz