allo-media / text2num

Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation.
https://text2num.readthedocs.io
MIT License

Issue with zero followed by characters resulting in dropped data #114

Open Masame opened 5 months ago

Masame commented 5 months ago

from text_to_num import alpha2digit

testing_text_to_num = alpha2digit("""Please call me at one two three four five six seven eight nine zero in reference to ticket C F zero three two zero seven eight two""", 'en', ordinal_threshold=0)

The expected result would have been either 'Please call me at 1 2 3 4 5 6 7 8 9 0 in reference to ticket C F 03 2 07 8 2' or 'Please call me at 1 2 3 4 5 6 7 8 9 zero in reference to ticket C F 03 2 07 8 2'

The actual result was: 'Please call me at 1 2 3 4 5 6 7 8 9 in reference to ticket C F 03 2 07 8 2'

If there is punctuation after the 'zero', it is converted to a digit just fine. If you remove the 'in' after 'zero', it is converted as well. I get why the parser is having trouble here (#42), but it should not be dropping the data: the 'zero' should either be converted or be left in the text as a word.
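
To make the data loss easy to check, here is a minimal sketch of a sanity test that compares the input and output strings from this report. It uses only the standard library and does not call text2num itself. The helper names (`count_digit_words`, `count_digits`, `lost_digits`) are hypothetical and are not part of the library. It also assumes the input contains only single-digit words ("zero" through "nine"), which holds for this example but not in general.

```python
import re

# Single-digit words only; this sketch deliberately ignores "ten", "twenty", etc.
DIGIT_WORDS = {
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
}

def count_digit_words(text: str) -> int:
    """Count spelled-out single digits in the text."""
    return sum(1 for tok in re.findall(r"[A-Za-z]+", text) if tok.lower() in DIGIT_WORDS)

def count_digits(text: str) -> int:
    """Count digit characters in the text."""
    return sum(ch.isdigit() for ch in text)

def lost_digits(source: str, converted: str) -> int:
    """Spelled-out digits in source with no counterpart (digit or word) in converted."""
    kept = count_digits(converted) + count_digit_words(converted)
    return count_digit_words(source) - kept

# Strings copied from the report above.
SOURCE = ("Please call me at one two three four five six seven eight nine zero "
          "in reference to ticket C F zero three two zero seven eight two")
ACTUAL = "Please call me at 1 2 3 4 5 6 7 8 9 in reference to ticket C F 03 2 07 8 2"
EXPECTED_DIGIT = "Please call me at 1 2 3 4 5 6 7 8 9 0 in reference to ticket C F 03 2 07 8 2"
EXPECTED_WORD = "Please call me at 1 2 3 4 5 6 7 8 9 zero in reference to ticket C F 03 2 07 8 2"

print(lost_digits(SOURCE, ACTUAL))          # 1: the 'zero' before 'in' is gone
print(lost_digits(SOURCE, EXPECTED_DIGIT))  # 0
print(lost_digits(SOURCE, EXPECTED_WORD))   # 0
```

A check like this could wrap the `alpha2digit` call as a guard until the underlying behaviour is fixed, falling back to the original text whenever `lost_digits` is non-zero.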