allo-media / text2num

Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation.
https://text2num.readthedocs.io
MIT License

[PT-BR] Some numbers are not being recognized #79

Closed RafaelMRazeira closed 1 year ago

RafaelMRazeira commented 2 years ago

Hi there! First of all, great work!!! This lib helps a lot <3

As with Spanish, there are some numerals in "pt" that the alpha2digit function does not recognize. Here is an example:

_text = "dezenove"
alpha2digit(_text, "pt")
expected: 19
return: "dezenove"

To reproduce, create a fresh environment and install text2num==2.4.0.
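The reproduction can be sketched as a small script. This is only a sketch: `check_conversion` and `CASES` are hypothetical helper names, not part of text2num, and the only case listed is the one quoted in this report. It assumes `alpha2digit` is importable from the `text_to_num` package and is called as `alpha2digit(text, lang)`, as in the snippet above.

```python
import importlib.util


def check_conversion(text, lang, expected, convert):
    """Run convert(text, lang) and report whether the result equals expected."""
    actual = convert(text, lang)
    return actual, actual == expected


# Case taken from this report; extend the list with other numerals as needed.
CASES = [("dezenove", "19")]

if importlib.util.find_spec("text_to_num") is not None:
    from text_to_num import alpha2digit

    for text, expected in CASES:
        actual, ok = check_conversion(text, "pt", expected, alpha2digit)
        status = "ok" if ok else "MISMATCH"
        print(f"{text!r}: expected {expected!r}, got {actual!r} ({status})")
else:
    # The library is not available in this environment, so nothing is checked.
    print("text2num is not installed; install it first to run the check")
```

A mismatch on "dezenove" would correspond to the behavior described here, where the input comes back unchanged.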

So far I have found these numbers:

In the case of "um", I saw this issue about the "ones" problem, but I don't think it applies to Portuguese...

Some screenshots to illustrate: [image] [image]

falcaopetri commented 2 years ago

@RafaelMRazeira, support for 19, 17 and 16 was added in #73. These changes have not been released on PyPI yet.

Until then, you can install from upstream:

$ pip install -U --force-reinstall git+https://github.com/allo-media/text2num.git

@rtxm I would also love to have the newer improvements from upstream in a release.

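Regarding parsing "um" (1), I'd argue that Portuguese suffers from the same ambiguity as English/French (#42). Take this sentence as an example: "tome como um exemplo essa sentença" ("take this sentence as an example"), where "um" is the indefinite article rather than the number 1.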
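To make the ambiguity concrete, here is a toy converter that replaces every known number word regardless of context. It is a hypothetical illustration only: `NAIVE_NUMBERS` and `naive_alpha2digit` are made-up names and do not reflect how text2num is implemented.

```python
import re

# Hypothetical word table for illustration only; not text2num's data.
NAIVE_NUMBERS = {"um": "1", "dezenove": "19"}


def naive_alpha2digit(text):
    """Replace every known number word with digits, ignoring context."""
    return re.sub(
        r"\w+",
        lambda m: NAIVE_NUMBERS.get(m.group(0).lower(), m.group(0)),
        text,
    )


print(naive_alpha2digit("tome como um exemplo essa sentença"))
# prints: tome como 1 exemplo essa sentença
```

The article "um" is rewritten as "1" even though no number was meant, which is the kind of false positive a context-free rule for "um" would produce.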

rtxm commented 1 year ago

2.5.0 Released!