Project-OSRM / osrm-text-instructions

Text instructions from OSRM route responses
BSD 2-Clause "Simplified" License

Initial French grammar rules to insert articles to way names #252

Closed yuryleb closed 6 years ago

yuryleb commented 6 years ago

Issue

Initial French grammar rules prototype to insert articles to way names (#251).

Currently only street names with the Rue status are supported (screenshot: French articles).

Tasklist

Requirements / Relations

The new languages/overrides/fr.js script adds the article option to most way_name keywords when translations are exported from Transifex. languages/grammar/fr.json contains the regular expressions (currently one) that process street names, and the changed languages.js loads and registers that French expressions file. test/grammar_test.js is extended with a test for French.
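As a rough illustration of the mechanism described above, the sketch below applies a list of regular-expression rules to a street name. The rule table, the function name `applyGrammar`, and the sample street name are assumptions made for this example; the actual layout of languages/grammar/fr.json may differ.

```javascript
// Hypothetical rule table: each entry is [pattern, replacement].
// The single rule lowercases the "Rue" way type and prepends its article.
const frenchArticleRules = [
  ['^Rue ', 'la rue ']
];

// Apply the first rule whose pattern matches; names that match
// no rule are returned unchanged.
function applyGrammar(name, rules) {
  for (const [pattern, replacement] of rules) {
    const re = new RegExp(pattern);
    if (re.test(name)) {
      return name.replace(re, replacement);
    }
  }
  return name;
}

console.log(applyGrammar('Rue de Rivoli', frenchArticleRules)); // la rue de Rivoli
console.log(applyGrammar('Les Marches', frenchArticleRules));   // Les Marches
```

Keeping the rules as data rather than code is what lets a language file grow from one expression to many without touching the loader.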

yuryleb commented 6 years ago

The next step: French elisions

Penegal commented 6 years ago

@benjamintd: I would say the correct string would be de La Rochelle: La is an article which is not subject to elision, so it would have to stay, and, as it is part of the name, it should keep its capital first letter.

This would have been different if the name was, for instance, Les Marches, as the elision would form des Marches, without capital since de has none, but, since the La article is not subject to this kind of elision, it should keep its capital first letter.
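A minimal sketch of the rule described in the two paragraphs above: a leading Les merges with de into des, while a leading La stays in place with its capital. The function name `withDe` is hypothetical, and the Le to du branch is an assumption taken from standard French grammar rather than from this thread. Names starting with a vowel (d’) are not handled here.

```javascript
// Prefix a place name with "de", contracting where French requires it.
function withDe(name) {
  if (/^les /i.test(name)) {
    return 'des ' + name.slice(4); // de + Les -> des (article loses its capital)
  }
  if (/^le /i.test(name)) {
    return 'du ' + name.slice(3);  // assumption: de + Le -> du
  }
  return 'de ' + name;             // La is kept as part of the name
}

console.log(withDe('La Rochelle')); // de La Rochelle
console.log(withDe('Les Marches')); // des Marches
```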

Penegal commented 6 years ago

I also notice that grammar rules should be applied on two other cases:

  1. when a name is placed after at: for instance, the correct translation for You have arrived at Les Marches is Vous êtes arrivé aux Marches, with elision, and not Vous êtes arrivé à Les Marches;
  2. when a name is placed after onto: the correct translation of Merge left onto Avenue des Champs-Élysées is S’insérer à gauche sur l’avenue des Champs-Élysées, not S’insérer à gauche sur Avenue des Champs-Élysées: the second version has a faulty capital and is missing the required article la, which is elided to l’ in this case.

yuryleb commented 6 years ago

@Penegal, I need more detailed cases for replacing à with aux in Vous êtes arrivés à {waypoint_name} strings :wink: Or should aux simply be used everywhere with {waypoint_name}?

Penegal commented 6 years ago

@yuryleb: à le is to be replaced by au, and à les by aux. This is not related to the first letter, as with l’, but to the first word; it is to be applied even if le or les is in the variable instead of the OSRM string, and is case insensitive: à Le is also to be replaced by au, and à Les by aux. You can see an example in my last comment.
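The contraction rule above can be sketched as follows. The function name `withA` is hypothetical; the matching is case-insensitive and looks only at the first word of the name, as described in the comment.

```javascript
// Prefix a waypoint name with "à", contracting à + le/les.
function withA(name) {
  if (/^les /i.test(name)) {
    return 'aux ' + name.slice(4); // à + les -> aux
  }
  if (/^le /i.test(name)) {
    return 'au ' + name.slice(3);  // à + le -> au
  }
  return 'à ' + name;              // no contraction for other names
}

console.log('Vous êtes arrivé ' + withA('Les Marches')); // Vous êtes arrivé aux Marches
console.log('Vous êtes arrivé ' + withA('La Rochelle')); // Vous êtes arrivé à La Rochelle
```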

yuryleb commented 6 years ago

Example of a rotary: French rotary

Penegal commented 6 years ago

Here is what I expected: the correct second instruction would be Prendre le rond-point de la place Charles de Gaulle. It should change according to the name of the roundabout; if it is named like Rond-point x, then the correct sentence would be Prendre le rond-point x, but, if it is another name, it should use the new elision rules: Place x would give Prendre le rond-point de la place x.
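The two cases named above could be distinguished roughly like this. The function name `roundaboutInstruction` is hypothetical, and the fallback for other name types is an assumption, since the comment only covers Rond-point x and Place x.

```javascript
// Build the roundabout instruction from the roundabout's name.
function roundaboutInstruction(name) {
  if (/^rond-point /i.test(name)) {
    // The name already says "rond-point": do not repeat it.
    return 'Prendre le rond-point ' + name.replace(/^rond-point /i, '');
  }
  if (/^place /i.test(name)) {
    // Lowercase the way type and add the preposition and article.
    return 'Prendre le rond-point de la place ' + name.replace(/^place /i, '');
  }
  // Assumption: other name types are not covered in the discussion,
  // so this sketch falls back to the bare instruction.
  return 'Prendre le rond-point';
}

console.log(roundaboutInstruction('Place Charles de Gaulle'));
// Prendre le rond-point de la place Charles de Gaulle
```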

yuryleb commented 6 years ago

@Penegal, the uturn/name phrase that was recently changed on Transifex is now Faire demi-tour à la fin de la route {way_name:article}, not Faire demi-tour à la fin {way_name:preposition} as before. That means no de/du/des preposition will be inserted before {way_name} after grammar processing. Is that correct, or should the override script be updated as well?

Penegal commented 6 years ago

@yuryleb: you should go with Faire demi-tour à la fin {way_name:preposition}, it's the correct sentence.
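To make the difference between the two token options concrete, here is a small sketch of how a `{way_name:option}` token might be filled in. The `transforms` table, the `fillTemplate` function, and the sample street name are all hypothetical and are not the library's actual implementation.

```javascript
// Hypothetical per-option transforms for a "Rue" street name.
const transforms = new Map([
  ['article', (name) => name.replace(/^Rue /, 'la rue ')],
  ['preposition', (name) => name.replace(/^Rue /, 'de la rue ')]
]);

// Replace {way_name} and {way_name:option} tokens in a phrase.
// Unknown or missing options leave the name unchanged.
function fillTemplate(template, wayName) {
  return template.replace(/\{way_name(?::(\w+))?\}/g, (token, option) => {
    const transform = transforms.get(option);
    return transform ? transform(wayName) : wayName;
  });
}

console.log(fillTemplate('Faire demi-tour à la fin {way_name:preposition}', 'Rue de Rivoli'));
// Faire demi-tour à la fin de la rue de Rivoli
```

With the preposition option the phrase itself stays free of de/du/des, so the grammar step can pick the right form for each name.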

Penegal commented 6 years ago

@yuryleb: I just checked, and, according to an orthographic reform, rond-point can also be written rondpoint.

Also note that I recently updated some Transifex strings; in case of conflict, you can drop the new ones. Just let me know so that I can restore them.

yuryleb commented 6 years ago

Yes, I can support both, but which should appear in the final text: rondpoint or rond-point?

And actually I could also merge your changes from Transifex if you have finished. I download them anyway every time I check how the override script works :wink:

Penegal commented 6 years ago

The main spelling is rond-point; the other one is controversial and most people would see it as an error.

If you can merge the Transifex changes with the one you did, perfect! I have a backup of the first ones, if needed, and I finished my edits, so you can go and merge.

Penegal commented 6 years ago

I noticed you added way types in your regex list; there could also be le sentier, which is something like a masculine form of sente.

yuryleb commented 6 years ago

It seems all is done. @Penegal, @benjamintd, do you have anything to add?

Penegal commented 6 years ago

@yuryleb: de/du/des/le/la/l’ should not use a bold typeface unless they are part of the name/destination, as, in this case, they are technically not part of it.

Edit: also, village has 2 L.

yuryleb commented 6 years ago

@Penegal, village has two L in the expressions; it was just a typo in the commit comment.

Bold formatting is part of the external formatToken() implementation in the OSRM sample frontend (see https://github.com/Project-OSRM/osrm-frontend/blob/gh-pages/src/itinerary_builder.js#L11). Perhaps I could extend my last formatToken() fix with your de/du/des/le/la/l’ formatting proposal.

yuryleb commented 6 years ago

@Penegal, so an expression to exclude appended articles/prepositions should look like ^(à )|(au )|(aux )|(d’)|(de )|(des )|(du )|(l’)|(la )|(le )|(les ) (case-sensitive, to distinguish appended words from those already in the name)?

This expression will help to format as below:

Is that enough, or should all added words also be non-bold, so that only cour d’Honneur stays bold?
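One detail worth checking in an expression of this shape: in JavaScript, `^` binds only to the first alternative, so the remaining alternatives can match anywhere in the string. The sketch below contrasts the expression as written with a grouped variant, and uses the grouped one to split off every leading appended word. The names `loose`, `anchored`, and `splitAppendedWords` are hypothetical, and the sample strings are illustrative.

```javascript
// As written in the comment: only "à " is anchored to the start.
const loose = /^(à )|(au )|(aux )|(d’)|(de )|(des )|(du )|(l’)|(la )|(le )|(les )/;

// Grouped variant: every alternative is anchored, and one or more
// appended words may be consumed. Case-sensitive on purpose, so a
// capitalized article inside the name (La, Les) is left alone.
const anchored = /^(?:(?:à|au|aux|de|des|du|la|le|les) |d’|l’)+/;

console.log(loose.test('Place du Marché'));    // true (matches "du " mid-string)
console.log(anchored.test('Place du Marché')); // false

// Split a processed name into the appended words and the name itself,
// so that a formatter can style the two parts differently.
function splitAppendedWords(text) {
  const match = anchored.exec(text);
  const prefix = match ? match[0] : '';
  return { prefix, name: text.slice(prefix.length) };
}

console.log(splitAppendedWords('de la cour d’Honneur'));
// { prefix: 'de la ', name: 'cour d’Honneur' }
```

A formatter could then emit the prefix as plain text and wrap only the name in bold markup.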

Penegal commented 6 years ago

@yuryleb: I would say the correct typography would be de la cour d’Honneur.

yuryleb commented 6 years ago

@Penegal, please take a look at how this is now implemented.

Penegal commented 6 years ago

@yuryleb: very good work. I'm OK with this.

Penegal commented 6 years ago

Is there a problem preventing an approving review?

Penegal commented 6 years ago

How often is the code of map.project-osrm.org updated? Just wanted to see it live… :grin:

1ec5 commented 6 years ago

map.project-osrm.org is powered by OSRM-frontend. That repository will need to be updated, but first I have to publish a new version of OSRMTI.