allo-media / text2num

Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation.
https://text2num.readthedocs.io
MIT License

alpha2digit doesn't work with "one day" #42

Open Krozark opened 3 years ago

Krozark commented 3 years ago

The lib is pretty cool, but I found a bug with a particular sentence:

alpha2digit("one day", "en")
'one day'

Expected: "1 day"

The same is true in French (that was my initial test, in fact).

rtxm commented 3 years ago

The current algorithm cannot distinguish between the number ("one day") and the pronoun ("the one, each one, another one, this one, etc…"). In French, it cannot distinguish between "un" the number ("one"), "un" the article ("a"), and "un" the pronoun ("j'en veux un", i.e. "I want one"). So, for such words, we don't replace by default, unless another number is next to it (see the unit tests).
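
To make the rule above concrete, here is a minimal toy sketch of that kind of heuristic. It is not text2num's actual implementation: the names `toy_alpha2digit`, `NUMBERS` and `AMBIGUOUS` are hypothetical, it only handles the single words "one" to "nine", and it assumes "next" means an immediately adjacent word on either side.

```python
# Toy illustration only; NOT text2num's real algorithm.
# All names here are hypothetical.
NUMBERS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9,
}

# Words that can be a number or something else (pronoun, article...).
AMBIGUOUS = {"one"}


def toy_alpha2digit(text: str) -> str:
    """Replace number words by digits, leaving ambiguous words alone
    unless a neighbouring word is also a number word."""
    words = text.split()
    out = []
    for i, word in enumerate(words):
        key = word.lower()
        if key not in NUMBERS:
            out.append(word)
            continue
        if key in AMBIGUOUS:
            # Look at the previous and the following word, if any.
            neighbours = words[max(i - 1, 0):i] + words[i + 1:i + 2]
            if not any(n.lower() in NUMBERS for n in neighbours):
                # No numeric context: keep the word as written.
                out.append(word)
                continue
        out.append(str(NUMBERS[key]))
    return " ".join(out)


print(toy_alpha2digit("one day"))        # ambiguous "one" is left alone
print(toy_alpha2digit("two days"))       # unambiguous word is converted
print(toy_alpha2digit("one two three"))  # "one" has a numeric neighbour
```

With this toy rule, "one day" is left untouched for the same reason reported in this issue: nothing around "one" signals that it is a number.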

Krozark commented 3 years ago

OK, that makes sense. Thanks for your reply. Maybe an option to force the replacement in those cases could be added.
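
Until such an option exists, a caller could approximate it by post-processing the output. The sketch below is a hypothetical caller-side workaround, not a text2num feature: `force_ambiguous` and `AMBIGUOUS_EN` are made-up names, and the mapping only covers the English word "one".

```python
import re

# Hypothetical caller-side workaround; not part of text2num.
AMBIGUOUS_EN = {"one": "1"}


def force_ambiguous(text: str, mapping=AMBIGUOUS_EN) -> str:
    """Replace every standalone ambiguous word by its digit form."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in mapping) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: mapping[m.group(1).lower()], text)


print(force_ambiguous("one day"))         # 1 day
print(force_ambiguous("the one I want"))  # the 1 I want (the trade-off)
```

The second call shows why this is not the default: forcing the replacement also rewrites pronoun uses of "one".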

rtxm commented 3 years ago

Yes. Let's tag this as a feature request then.