Project-OSRM / osrm-text-instructions

Text instructions from OSRM route responses
BSD 2-Clause "Simplified" License

I can't understand the French grammar [help] #308

Closed mrtnetwork closed 1 year ago

mrtnetwork commented 1 year ago

Hi, when I use the normal translation for French navigation instructions, I get results like

Place de l'Hôtel de Ville

but with grammar, I get results like

la place de $1'Hôtel de Vill

What is $1? Thanks.

1ec5 commented 1 year ago

That’s unexpected. You should be seeing the “l’” article come through untouched. Are you using the usual compile() function, and are you using any options such as formatToken?

I think the $1 is coming from the following line in the French “grammar” file:

https://github.com/Project-OSRM/osrm-text-instructions/blob/6a77ef9be61b6f3b659177b47e407ccb87b63975/languages/grammar/fr.json#L58

The following code is supposed to replace the $1 on the right side with the “d” or “l” captured in the left side:

https://github.com/Project-OSRM/osrm-text-instructions/blob/6a77ef9be61b6f3b659177b47e407ccb87b63975/index.js#L254-L260

mrtnetwork commented 1 year ago

Hi, thanks for your answer. I am using Dart for this; here is my tokenize method:


if (grammers != null) {
  final out = output.replaceAllMapped(
    // Matches {token} or {token:grammar} placeholders.
    RegExp(
      r"\{(\w+)(?::(\w+))?\}",
      caseSensitive: false,
    ),
    (match) {
      final String? token = match[0]; // the whole placeholder
      String? tag = match[1]; // token name
      String? grm = match[2]; // optional grammar name
      String? value = tokens[tag];
      if (value == null) {
        // Unknown token: leave the placeholder as it is.
        return token!;
      }
      value = grammerReplace(value, grm, grammers);
      return value;
    },
  ).replaceAll(RegExp(r' {2}'), ' '); // collapse double spaces
  return RoadHelper(out, icon);
} else {
  tokens.forEach((key, value) {
    output = output.replaceAll('{$key}', value);
  });
}
grammerReplace

static String grammerReplace(String name, String? grammar, Map<String, dynamic> grammers) {
  if (grammar == null) return name;
  List? rules = grammers["v5"]?[grammar];
  if (rules == null) return name;
  String n = ' $name ';
  for (var rule in rules) {
    RegExp re = RegExp("${rule[0]}", unicode: true, caseSensitive: true);
    n = n.replaceAll(re, rule[1]);
  }
  return n.trim();
}

1ec5 commented 1 year ago
n = n.replaceAll(re, rule[1]);

According to the replaceAll documentation, you’ll need to use replaceAllMapped. I’m not sure if it understands the $n syntax out of the box, or if you’d have to write some code to parse it yourself.
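For illustration, here is a minimal sketch of the difference in Dart. The rule below is hypothetical: it only mimics the [pattern, replacement] shape of the grammar files and is not copied from fr.json, and the function names are made up for this example. As far as the Dart documentation describes it, replaceAll does not interpret the replacement string, so the group has to be substituted by hand inside replaceAllMapped.

```dart
// Hypothetical rule in the [pattern, replacement] shape of the grammar files;
// it is an illustration, not a copy of any line in fr.json.
const hypotheticalRule = [r" de ([dl])'", r" de $1'"];

/// replaceAll inserts the replacement text verbatim, so "$1" stays literal.
String applyLiteral(String input, List<String> rule) =>
    input.replaceAll(RegExp(rule[0]), rule[1]);

/// replaceAllMapped exposes the Match, so group 1 can be substituted by hand.
String applyWithGroup(String input, List<String> rule) =>
    input.replaceAllMapped(
      RegExp(rule[0]),
      (m) => rule[1].replaceAll(r'$1', m.group(1)!),
    );

void main() {
  const input = " la place de l'Hôtel de Ville ";
  print(applyLiteral(input, hypotheticalRule).trim());
  // prints: la place de $1'Hôtel de Ville
  print(applyWithGroup(input, hypotheticalRule).trim());
  // prints: la place de l'Hôtel de Ville
}
```

This sketch only handles `$1`; a rule with more groups would need every `$n` reference expanded.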

I didn’t realize you were using the grammar data in a port of this library. There’s actually a spot in the readme for promoting ports, in case you’re interested. 🙂

https://github.com/Project-OSRM/osrm-text-instructions/blob/6a77ef9be61b6f3b659177b47e407ccb87b63975/README.md#L9

mrtnetwork commented 1 year ago
Thank you so much, I think it is fixed now. I thought this was a simple replacement; I'm not too familiar with how regex works. Thanks for the tip.

  static String grammarizeTest(String name, String? grammar, Map<String, dynamic> grammers) {
    if (grammar == null) return name;
    List<dynamic>? rules = grammers["v5"]?[grammar];
    if (rules == null) return name;
    String flags = grammers["meta"]?["regExpFlags"] ?? "";
    bool isCaseSensitive = !flags.contains("i");
    String n = ' $name ';
    for (var rule in rules) {
      RegExp re = RegExp("${rule[0]}", caseSensitive: isCaseSensitive);
      n = replace(n, re, rule[1]);
    }
    return n.trim();
  }

  static String replace(String inputText, RegExp pattern, String strReplacePattern) =>
      inputText.replaceAllMapped(pattern, (match) {
        var replacedString = strReplacePattern;
        for (var i = 0; i <= match.groupCount; i++) {
          if (match.group(i) == null) continue;
          replacedString = replacedString.replaceAll('\$$i', match.group(i)!);
        }
        return replacedString;
      });

log

String test = grammarizeTest(value, grammar, grammers);
final currentValue = value;
value = grammarize(value, grammar, grammers);

if (test != value) {
  print("is not same v: $value t: $test c: $currentValue");
} else {
  print("is same");
}

Output:

is not same v: de $1iry-Châtillon t: de Viry-Châtillon c: Viry-Châtillon
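One thing that may be worth double-checking in the replace helper above: it substitutes `$0`, `$1`, ... in ascending order, so the `$1` pass would also rewrite the start of a `$10` reference, and a group that did not take part in the match leaves its `$n` in the output. This probably does not matter for rules with only a few groups, but a single-pass variant avoids both cases. The sketch below is an assumption-laden illustration, not the library's behavior: `expandGroups` and `replaceWithGroups` are hypothetical names, only `$n` with one or two digits is handled (no `$$` or `$&`), and the rule in `main` is made up rather than taken from fr.json.

```dart
// Matches "$n" references with one or two digits, e.g. "$1" or "$12".
final _groupRef = RegExp(r'\$(\d{1,2})');

/// Expands "$n" references in [template] using the groups of [match].
String expandGroups(Match match, String template) =>
    template.replaceAllMapped(_groupRef, (ref) {
      final index = int.parse(ref.group(1)!);
      // References to groups the pattern does not define are left untouched.
      if (index > match.groupCount) return ref.group(0)!;
      // A group that did not participate in the match expands to ''.
      return match.group(index) ?? '';
    });

/// Replaces every match of [pattern] in [input] with the expanded [template].
String replaceWithGroups(String input, RegExp pattern, String template) =>
    input.replaceAllMapped(pattern, (m) => expandGroups(m, template));

void main() {
  // Hypothetical rule: prefix "de" when the name starts with a non-vowel.
  final re = RegExp(r'^ ([^aeiouAEIOU])');
  final out = replaceWithGroups(' Viry-Châtillon ', re, r' de $1');
  print(out.trim()); // prints: de Viry-Châtillon

  // The unmatched first group expands to an empty string.
  print(replaceWithGroups('b', RegExp(r'(a)|(b)'), r'[$1$2]')); // prints: [b]
}
```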