savoirfairelinux / num2words

Modules to convert numbers to words. 42 --> forty-two
GNU Lesser General Public License v2.1

String containing mixed text and numerals should work. #281

Open aaronrudkin opened 5 years ago

aaronrudkin commented 5 years ago

Expected Behaviour

Combinations of strings and numbers should convert as expected. For instance:

num2words("text 1") should return "text one"

Actual Behaviour

A "decimal.InvalidOperation" exception is raised.

Steps to reproduce

Call num2words with any mix of characters and numerals.
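
The exception name points at the standard-library decimal module, which suggests the input string is being handed to Decimal as-is. As a minimal sketch of that failure mode, using only the standard library (the helper name parses_as_decimal is hypothetical and is not part of num2words):

```python
from decimal import Decimal, InvalidOperation


def parses_as_decimal(text):
    """Return True if text can be converted to Decimal directly."""
    try:
        Decimal(text)
    except InvalidOperation:
        # Raised for strings such as "text 1" that are not a bare number.
        return False
    return True


print(parses_as_decimal("1"))       # True
print(parses_as_decimal("text 1"))  # False
```

Any string that fails this check would presumably hit the same exception when passed straight to num2words.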

It seems to me that the general approach should be: if a string cannot be converted directly to a decimal, the library should find the numbers with a regex, convert them, and substitute the results back into the string. I was a little surprised this isn't supported, because to me it would be a major use case for the library. In my particular case I was trying to canonicalize user searches.

Solution Sketch

I ended up doing this (which is certainly not a robust solution, and is English-language specific, etc. etc. but is a sort of rough template of what I had in mind). I am sure there are a number of edge cases that this doesn't handle properly, but just the same...

import re

from num2words import num2words


def num_wrapper(text):
    """Wrap num2words to allow mixed text-numeric strings."""
    # Match runs of digits optionally joined by "," or ".", e.g. "11." or "2000.95"
    return re.sub(r"(([0-9]+[,.]?)+([,.][0-9]+)?)", num_wrapper_inner, text)


def num_wrapper_inner(match):
    """Feed the string matched by the regex to num2words."""
    return num2words(match.group())


num_wrapper("test 11... more 2000.95 numbers... 9-1-1")
# 'test eleven.. more two thousand point nine five numbers... nine-one-one'
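
Two of the edge cases hinted at above are trailing punctuation being swallowed by the match (the "11." case) and comma thousands separators, which Decimal does not accept. A slightly stricter variant might look like the sketch below. It is only an illustration under stated assumptions: English-style separators, hypothetical names (NUMBER_RE, convert_numbers, to_words), and a pluggable converter so the substitution logic can be tried without num2words installed. It is not how the library itself works.

```python
import re
from decimal import InvalidOperation

# Assumed number format: either comma-grouped thousands ("2,000.95")
# or plain digits ("2000.95"), each with an optional decimal fraction.
NUMBER_RE = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?")


def convert_numbers(text, to_words=None):
    """Replace each numeral in text with the result of to_words."""
    if to_words is None:
        # Imported lazily so a stub converter can be used instead.
        from num2words import num2words as to_words

    def replace(match):
        # Strip thousands separators before handing the token over.
        token = match.group().replace(",", "")
        try:
            return to_words(token)
        except (InvalidOperation, OverflowError):
            # Leave anything the converter rejects unchanged.
            return match.group()

    return NUMBER_RE.sub(replace, text)


# Stub converter, used here only to show which tokens get extracted.
print(convert_numbers("test 11... more 2,000.95 numbers", lambda s: "<" + s + ">"))
# test <11>... more <2000.95> numbers
```

Calling convert_numbers(text) with no second argument uses num2words itself. Because the pattern never includes a trailing "." or ",", sentence punctuation is left in place.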

khuang0312 commented 4 years ago

May I take a stab at this issue?

mromdhane commented 3 years ago

@aaronrudkin yes it is interesting, @khuang0312 we can work on it.

thiborose commented 1 year ago

Hello, has anyone been working on this? I am also interested.