RosaeNLG / rosaenlg

RosaeNLG is a Natural Language Generation library for node.js and browser rendering, based on the Pug template engine.
https://rosaenlg.org
Apache License 2.0

[BUG] Two word contractions aren't applied properly to capitalized definite articles in French #196

Closed bdbernardy closed 1 year ago

bdbernardy commented 1 year ago

Describe the bug

Some French city names include a definite article: "Le Creusot", "Le Mans", "Les Mureaux", etc. Contractions aren't applied properly to these proper nouns because of the capitalized "Le" or "Les".

For example, "Je vais à Le Creusot" should be contracted to "Je vais au Creusot".

Expected behavior

The following two-word combinations should be contracted as follows:

de + Le → du
de + Les → des
à + Le → au
à + Les → aux

Code

The problem is in these lines of code.

We need to add the following four entries to the two-word contraction rules:

['de', 'Le', 'du'],
['de', 'Les', 'des'],
['à', 'Le', 'au'],
['à', 'Les', 'aux'],
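
To illustrate how a rule table of this shape could be applied, here is a minimal sketch. It is not RosaeNLG's actual implementation: the `Rule` type, the `rules` array, and the `contractTwoWords` helper are hypothetical names, and the lowercase entries are assumed to be present already.

```typescript
// Hypothetical rule table: [first word, second word, contracted form].
// The lowercase entries are assumed to exist already; the last four are
// the capitalized variants proposed in this issue.
type Rule = [string, string, string];

const rules: Rule[] = [
  ['de', 'le', 'du'],
  ['de', 'les', 'des'],
  ['à', 'le', 'au'],
  ['à', 'les', 'aux'],
  ['de', 'Le', 'du'],
  ['de', 'Les', 'des'],
  ['à', 'Le', 'au'],
  ['à', 'Les', 'aux'],
];

// Replace each "first second" pair with its contraction when both words
// stand alone (delimited by whitespace or the string boundaries).
// Matching is case-sensitive, so 'le' and 'Le' need separate entries.
function contractTwoWords(text: string): string {
  let result = text;
  for (const [first, second, contracted] of rules) {
    const pattern = new RegExp(`(^|\\s)${first} ${second}(?=\\s|$)`, 'g');
    result = result.replace(pattern, `$1${contracted}`);
  }
  return result;
}

console.log(contractTwoWords('Je vais à Le Creusot')); // Je vais au Creusot
```

Because the match is case-sensitive in this sketch, removing the four capitalized entries leaves "à Le Creusot" untouched, which mirrors the behavior reported above.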


ludans commented 1 year ago

Thanks @bdbernardy for spotting the issue and the lines to change. There is one potential side effect, as in:

Ce principe permettra à Le Corbusier...

This will be contracted as well. But let's assume that such names should be properly protected with the appropriate mixin.

I'll probably do what you propose.
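
The "Le Corbusier" side effect can be reproduced with a small sketch. It assumes a naive, unprotected application of the new `à` + `Le` rule; `contractALe` is a hypothetical helper, not RosaeNLG code, and the sketch does not model the protection mixin.

```typescript
// Naive application of the proposed 'à' + 'Le' -> 'au' rule,
// with no protection of proper names (hypothetical helper).
const contractALe = (text: string): string =>
  text.replace(/(^|\s)à Le(?=\s|$)/g, '$1au');

// Intended effect on a city name:
const city = contractALe('Je vais à Le Creusot');
// city === 'Je vais au Creusot'

// Side effect on a person's name, which should stay intact:
const person = contractALe('Ce principe permettra à Le Corbusier de construire');
// person === 'Ce principe permettra au Corbusier de construire'

console.log(city);
console.log(person);
```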

ludans commented 1 year ago

Thanks for the PR. Will be released in > 4.2.0. Tell me if you need an early release (otherwise I will release in 1 or 2 months among other changes).

bdbernardy commented 1 year ago

> Thanks for the PR. Will be released in > 4.2.0. Tell me if you need an early release (otherwise I will release in 1 or 2 months among other changes).

Hi Ludans,

I was happy to help! 1-2 months is fine. Thank you for asking.