facebook / duckling

Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.

Reject common phrases like "guten morgen" #729

Open emlautarom1 opened 5 months ago


In German, "morgen" is used for "tomorrow" and there is a rule for that:

https://github.com/facebook/duckling/blob/7520daaeba28691cda8e1b5c3d946028a28fb64b/Duckling/Time/DE/Rules.hs#L40

Unfortunately, this means that phrases like "guten morgen" ("good morning") are interpreted as "tomorrow", even though the greeting is a common phrase that has nothing to do with time. I would like to add a rule that detects this pattern and always rejects it.

My first attempt used a negative lookbehind, (?<!guten\s+)morgen, but the regex engine does not support this. My second attempt was a rule that matches the regex guten\s+morgen and has a production of const Nothing, but this does not work either: the morgen rule still matches inside the phrase and its result is used instead.

How can I force Duckling to always ignore this phrase?
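
One possible workaround, sketched here outside of Duckling itself, is to post-filter the parsed results and drop any match whose span falls inside an occurrence of the greeting. This is only a minimal sketch under stated assumptions: `Match`, `greetingSpans`, and `rejectGreetings` are hypothetical names that are not part of Duckling's API, and `Match` is a simplified stand-in for a parsed entity with half-open character offsets. The scan mirrors the guten\s+morgen regex (case-insensitive, no word-boundary check) using only `base`.

```haskell
import Data.Char (isSpace, toLower)
import Data.List (isPrefixOf)

-- Hypothetical stand-in for a parsed entity (NOT a Duckling type):
-- a half-open character span [mStart, mEnd) plus a label.
data Match = Match
  { mStart :: Int
  , mEnd   :: Int
  , mValue :: String
  } deriving (Eq, Show)

-- Half-open spans where "guten", one or more whitespace characters,
-- and "morgen" occur in the input, ignoring case.
greetingSpans :: String -> [(Int, Int)]
greetingSpans input = go 0 (map toLower input)
  where
    go _ [] = []
    go i s@(_ : rest)
      | "guten" `isPrefixOf` s
      , (ws, afterWs) <- span isSpace (drop 5 s)
      , not (null ws)
      , "morgen" `isPrefixOf` afterWs
      = (i, i + 5 + length ws + 6) : go (i + 1) rest
      | otherwise = go (i + 1) rest

-- Keep only the matches that do not lie entirely inside a greeting span.
rejectGreetings :: String -> [Match] -> [Match]
rejectGreetings input = filter (not . insideGreeting)
  where
    spans = greetingSpans input
    insideGreeting m =
      any (\(s, e) -> s <= mStart m && mEnd m <= e) spans
```

For the input "guten morgen, bis morgen" with two "tomorrow" matches at spans 6-12 and 18-24, `rejectGreetings` would drop the first (it lies inside the greeting) and keep the second. This filters after parsing rather than changing how rules compete inside the engine, so it does not answer whether Duckling can reject the phrase natively.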