allo-media / text2num

Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation.
https://text2num.readthedocs.io
MIT License

[GER] ein, einen, einem not converted properly #85

Closed: agademic closed this issue 1 year ago

agademic commented 1 year ago

Hi! First of all, thank you for your great library! In general everything works really well. However, there is one case where written numbers are not converted properly in German: the inflected forms of "one". Here are some examples:

alpha2digit("einem Getränk", "de") -> "einem Getränk"
alpha2digit("ein Getränk", "de") -> "ein Getränk"
alpha2digit("eine Getränk", "de") -> "eine Getränk"
alpha2digit("einen Getränk", "de") -> "einen Getränk"
alpha2digit("eines Getränk", "de") -> "eines Getränk"

Expected in all cases: "1 Getränk"

What works is alpha2digit("eins Getränk", "de") -> "1 Getränk", but that's the only case.

I know German grammatical cases are somewhat nasty. In English you just say "one drink" in every case. But it would be really awesome if your library could handle these forms. Or am I missing something?

Anyway, thank you for any tips!

rtxm commented 1 year ago

In all the languages we support, alpha2digit does not treat the indefinite article as a number.

alpha2digit("un chien", "fr")
"un chien"
alpha2digit("a dog", "en")
"a dog"
# and even
alpha2digit("one dog", "en")
"one dog"
# but
alpha2digit("one two three", "en")
"1 2 3"

The purpose of the provided alpha2digit function is to make text easier to read for humans. Follow #42 for a feature extension to "force" conversion of articles.
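Until such an option exists, one possible workaround is to pre-process the text yourself. The sketch below is not part of text2num: `normalize_german_one` is a hypothetical helper that rewrites the inflected forms of "ein" to "eins", the one form reported above as being converted. It assumes every standalone "ein/eine/einem/einen/einer/eines" in the input is meant as the number one, which is often not true in running text, so treat it as an illustration rather than a general fix.

```python
import re

# Hypothetical helper, not part of text2num.
# Matches the standalone forms ein, eine, einem, einen, einer, eines.
# The word boundaries leave "eins", "keine" and "einundzwanzig" untouched.
_EIN_FORMS = re.compile(r"\b(ein(?:e[mnrs]?)?)\b", re.IGNORECASE)


def normalize_german_one(text: str) -> str:
    """Replace inflected forms of German "ein" with "eins".

    Assumption: every match is intended as the number one, not as an
    indefinite article. The original capitalisation is not preserved.
    """
    return _EIN_FORMS.sub("eins", text)


if __name__ == "__main__":
    for phrase in ("einem Getränk", "ein Getränk", "einen Getränk"):
        print(normalize_german_one(phrase))
```

The normalised string can then be passed on, for example `alpha2digit(normalize_german_one("einem Getränk"), "de")`. Based on the "eins Getränk" result reported earlier in this thread, that should yield "1 Getränk", but this has not been checked against every library version.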