Project-OSRM / osrm-text-instructions

Text instructions from OSRM route responses
BSD 2-Clause "Simplified" License

Grammar rules: limits? #251

Open Penegal opened 6 years ago

Penegal commented 6 years ago

Hello, there.

While running some tests with OSRM in French, I noticed it has no support for elision, which is particularly noticeable in this language: OSRM will, for instance, say "Tourner légèrement à droite en direction de Épinal", whereas correct French would be "Tourner légèrement à droite en direction d’Épinal". Fixing that would require adapting the translated text according to rules applied to the OSM names, but would this be possible using grammar rules?

Another related question: in some cases, OSRM will say, for instance, "Tourner à gauche sur Rue de Jarménil (D 159b)", but the correct French sentence would be "Tourner à gauche sur la rue de Jarménil (D 159b)", with a definite article (as the way name doesn't have one) and a lower-case first letter. That would greatly improve the readability and user-friendliness of the instructions, but would it be possible and desirable?

I'm willing to help with these, at least for the regex part, as it is my first time on this project and I'm not sure I could write such a feature completely from scratch.

Regards.

yuryleb commented 6 years ago

Yes, this looks possible and desirable. I propose adding a "grammar case" (say, elision) for French, adding this elision option to all way_name keywords, and prototyping a few expressions for elisions and articles. I hope this could work.

yuryleb commented 6 years ago

Actually, your first issue requires changing the whole source phrase, not only the way_name value, so it is outside what grammar rules can do :frowning:

Maybe it's possible to apply elision right in the source phrase, like below?

"destination": "Tourner à droite en direction d’{destination}"

BTW, the French translation on Transifex also looks out of sync with the current languages/translations/fr.json content, which makes it much harder to work with a new French override script that adds grammar options. Maybe you can fill in the missing translations on Transifex first, or maybe @benjamintd, @patjouk, @guillaumerose could help with this?

Penegal commented 6 years ago

I don't think it will be possible to apply elision directly in the source phrases, at least not always, as it only applies before vowels and before some words starting with an h. It would need a bit of regex to detect the cases where elision applies.

yuryleb commented 6 years ago

OK, then we need another set of rules to post-process the content of the whole final phrase. Actually, this is also necessary for Russian (#240) :wink:
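To make the idea concrete, here is a minimal sketch of such a post-processing step, written in Python purely for illustration (it is not the project's code). The name `apply_elision`, the vowel list, and the short list of mute-h words are all assumptions; a real rule set would need a fuller list, and "y" is left out as a simplification.

```python
import re

# Assumption: vowels (plain and accented) that trigger elision; "y" is omitted.
VOWELS = "aàâäeéèêëiîïoôöuùûüœæ"
# Illustrative only: a couple of words whose initial h is mute.
MUTE_H_WORDS = ("hôpital", "hôtel")

# What must follow "de " for the elision to apply:
# a vowel, or one of the listed mute-h words (matched case-insensitively).
_NEXT = "[" + VOWELS + VOWELS.upper() + "]|(?i:" + "|".join(MUTE_H_WORDS) + r")\b"
ELISION = re.compile(r"\b([Dd])e (?=" + _NEXT + ")")


def apply_elision(phrase):
    """Replace 'de ' with 'd’' wherever the next word calls for elision."""
    return ELISION.sub(lambda m: m.group(1) + "’", phrase)


print(apply_elision("Tourner légèrement à droite en direction de Épinal"))
```

Because the rule runs on the finished sentence, it does not depend on which keyword produced the name, which is what a language-independent post-processing hook would need.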

benjamintd commented 6 years ago

@yuryleb would it make sense to add the article (de or d') inside the way name grammar case?

We would have something like: "destination": "Tourner à droite en direction {destination:article}" which would give:

But then I don't know what to do with cases like:

If we don't restrict ourselves to the grammar JSON file model, we can consider @yuryleb's solution of post-processing the whole sentence.

Penegal commented 6 years ago

@benjamintd: you could detect such cases with case-insensitive regexes on the destination string, evaluated in the following order like a switch (the first matching case stops evaluation):

  1. if it begins with le, use du instead of de and remove the first word of the destination string;
  2. if it begins with les, use des instead of de and remove the first word of the destination string;
  3. if it begins with la, downcase the first letter of the destination string;
  4. if it begins with rue, use de la and downcase the first letter of the destination string;
  5. if it begins with avenue, use de l’ and downcase the first letter of the destination string;
  6. [other street classifications here]
  7. if it begins with a vowel or an elision h, use d’ instead of de.

This also removes useless capital first letters, which in French are used only at the start of sentences and for proper names, and which are therefore currently a distraction when reading OSRM instructions: "Tourner à droite sur la Rue de Jarménil" is essentially an error, as the correct sentence would be "Tourner à droite sur la rue de Jarménil".
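The ordered rules above can be sketched as follows, in Python for illustration only. The function name `with_article` and the vowel list are assumptions, the "other street classifications" are left out, and the elision-h case is not covered because it needs a word list.

```python
import re

# Assumption: vowels that trigger elision (matched case-insensitively below).
VOWELS = "aàâäeéèêëiîïoôöuùûü"


def lower_first(text):
    """Downcase only the first letter of the string."""
    return text[:1].lower() + text[1:]


def drop_first_word(text):
    """Remove the leading word (the article) from the string."""
    parts = text.split(None, 1)
    return parts[1] if len(parts) > 1 else ""


# Evaluated in order, like a switch: the first matching pattern wins.
RULES = [
    (r"le\s", lambda d: "du " + drop_first_word(d)),
    (r"les\s", lambda d: "des " + drop_first_word(d)),
    (r"la\s", lambda d: "de " + lower_first(d)),
    (r"rue\s", lambda d: "de la " + lower_first(d)),
    (r"avenue\s", lambda d: "de l’" + lower_first(d)),
    (r"[" + VOWELS + "]", lambda d: "d’" + d),
]


def with_article(destination):
    """Prefix the destination with the matching French preposition/article."""
    for pattern, build in RULES:
        if re.match(pattern, destination, re.IGNORECASE):
            return build(destination)
    return "de " + destination  # default: plain "de"


for name in ("Le Havre", "Rue de Jarménil", "Épinal", "Nancy"):
    print("Tourner à droite en direction " + with_article(name))
```

Each pattern requires whitespace after the keyword, so "le" does not accidentally match "les" and the order given in the list can be kept as is.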

yuryleb commented 6 years ago

@benjamintd, actually it's a great idea to just "move" the prepended article into way_name and/or destination: then we can work inside the current grammar rules model :+1: @Penegal, I suppose all these rules could easily be expressed as regular expressions.

I have already prototyped and published a first dummy implementation of French "grammar"; just give me some time to change it according to your proposals.

yuryleb commented 6 years ago

@Penegal, @benjamintd, please review my second commit in #252, especially the proposed list of street status names (I collected them earlier for translation in my Garmin Russian TTS voices project).

I had to add new elision rules specifically for handling the destination keyword, to insert the proper du/des/de article/preposition. Please note that this de is now no longer necessary before {destination} in the Transifex translations and could be removed there.

Actually, I haven't yet tested routes with destination: do you know places where destination is filled in on roads/junctions?

Penegal commented 6 years ago

@yuryleb: maybe we should wait for your PR to be merged before updating translations?

yuryleb commented 6 years ago

Yes, since we have started changing the French translation strings too, it's better to finish the PR first. Fortunately, Transifex seems to accept uploading the translation JSON back (just don't forget to remove :article and the other grammar options we added before uploading; they will be inserted automatically by the languages/overrides/fr.js script).
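As a rough illustration of that clean-up step, here is a small Python sketch that strips grammar options from every {token:option} placeholder. The helper name `strip_options` is hypothetical, and the assumption that the translation file is nested JSON objects and lists of strings is mine.

```python
import re

# Matches placeholders such as {destination:article} or {way_name:article}.
OPTION = re.compile(r"\{(\w+):\w+\}")


def strip_options(node):
    """Recursively remove ':option' suffixes from placeholders in all strings."""
    if isinstance(node, str):
        return OPTION.sub(r"{\1}", node)
    if isinstance(node, dict):
        return {key: strip_options(value) for key, value in node.items()}
    if isinstance(node, list):
        return [strip_options(value) for value in node]
    return node  # numbers, booleans, None are left untouched


sample = {"destination": "Tourner à droite en direction {destination:article}"}
print(strip_options(sample))
```

Placeholders without an option, such as {way_name}, pass through unchanged.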

Penegal commented 6 years ago

@yuryleb: you can try on these interchanges:

I worked on them this morning, so, if map.project-osrm.org uses the OSM.org DB, you should be able to test on them. BTW, I saw cases where lane assist wasn't working even though the turn:lanes data were available. Is there a condition for lane assist to be enabled, such as a highway=motorway_junction node where the ways connect, or are the map.project-osrm.org data behind the OSM.org DB?

yuryleb commented 6 years ago

It seems destination processing also works:

[screenshots: french destination1, french destination2, french destination3]

Incorrect turn:lanes processing is mostly an osrm-backend issue, as in the last screenshot.