scrapinghub / number-parser

Parse numbers written in natural language
BSD 3-Clause "New" or "Revised" License

Add a `parse_number()` function #7

Closed: noviluni closed this issue 4 years ago

noviluni commented 4 years ago

I think it would be really useful to have a function called parse_number() that takes a number written in words and returns the equivalent Python number:

>>> parse_number("one")
1

>>> parse_number("twenty six")
26

It's easy to imagine use cases, and building it will also help us refactor the current code.

With the currently implemented logic, it shouldn't be difficult to build it on top of number_builder():

>>> number_builder(["twenty", "six"])
['26']

>>> number_builder(["one"])
['1']

We should decide what to do if the input contains anything else. I think we could just return None.

>>> parse_number("one second")
None

>>> parse_number("twenty six cars")
None
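A minimal sketch of the behaviour proposed so far, assuming a tiny hand-written English vocabulary. The `UNITS` and `TENS` tables and the two-token grammar are illustrative assumptions, not the library's data or its actual implementation:

```python
# Hypothetical, deliberately tiny English vocabulary (not the library's data).
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50}


def parse_number(text):
    """Return the int written in `text`, or None if `text` is not exactly one number."""
    tokens = text.lower().split()
    if len(tokens) == 1:
        # A lone unit ("one") or a lone tens word ("twenty").
        return UNITS.get(tokens[0], TENS.get(tokens[0]))
    if len(tokens) == 2 and tokens[0] in TENS and tokens[1] in UNITS:
        # A tens word followed by a unit, e.g. "twenty six".
        return TENS[tokens[0]] + UNITS[tokens[1]]
    # Anything else (extra words, several numbers, empty input) is rejected.
    return None
```

Returning None rather than raising keeps the call cheap to use as a filter, at the cost of not telling the caller why the input was rejected.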

Of course, this will be multi-locale, so it would be useful to accept a locales (or locale) argument:

>>> parse_number("uno", locales=["en"])
None

>>> parse_number("uno", locales=["es"])
1
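One way the locale lookup could work, as a rough sketch. The `VOCABULARIES` table, the default of `("en",)`, and the single-word lookup are all assumptions made for illustration; the issue does not specify a default locale or how the data is stored:

```python
# Hypothetical per-locale vocabularies, limited to a few single words.
VOCABULARIES = {
    "en": {"one": 1, "two": 2, "three": 3},
    "es": {"uno": 1, "dos": 2, "tres": 3},
}


def parse_number(text, locales=("en",)):
    """Look `text` up in each requested locale in order; return None if none match."""
    word = text.strip().lower()
    for locale in locales:
        # Unknown locale codes are treated as empty vocabularies.
        value = VOCABULARIES.get(locale, {}).get(word)
        if value is not None:
            return value
    return None
```

With this shape, `parse_number("uno", locales=["en", "es"])` falls through the English table and matches in the Spanish one.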

noviluni commented 4 years ago

Hey @arnavkapoor, I think we could prioritize this, as it will help to define a better structure for parser.py.

noviluni commented 4 years ago

By the way, I think that parse_number("1") should return 1 too.
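A small sketch of how digit strings could be handled before any word lookup. `parse_digits` is a hypothetical helper name, and restricting it to ASCII digits is an assumption:

```python
import re


def parse_digits(text):
    """Return an int for a string made only of ASCII digits, else None."""
    stripped = text.strip()
    # [0-9]+ rejects signs, underscores, and characters such as superscript digits.
    if re.fullmatch(r"[0-9]+", stripped):
        return int(stripped)
    return None
```

parse_number() could call this first and fall back to the word-based logic when it returns None.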

arnavkapoor commented 4 years ago

Sure @noviluni, I will start working on this. Just confirming: even multiple numbers in the same string should return None.

>>> parse_number("one two")
None

I will create a PR in a while; however, I'm not very sure how exactly we would use it within the main parse function for reusability.

noviluni commented 4 years ago

> even multiple numbers in the same string should return None.

I think so.

> not very sure how exactly we would use it within the main parse function

I think that we could extract part of the logic currently inside number_builder() into another function to be reused by parse_number(). In fact, the logic would be pretty much the same except for:

However, I'd prefer that you code something and we discuss it in the PR, as it will be easier for both of us to see how it can be applied / refactored. :smile:

arnavkapoor commented 4 years ago

Hi @noviluni, sure. I was wondering about the feasibility of the main parse() function using this parse_number(). But of course, they can both use the common number_builder(). This structure is what I went with for the current PR #9.
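For readers following along, here is a rough sketch of the structure being discussed: one shared builder used by both entry points. It is not the code from the PR; `build_number`, the tiny vocabulary, and the greedy two-token matching in `parse` are assumptions made for illustration:

```python
# Hypothetical minimal vocabulary shared by both entry points.
UNITS = {"one": 1, "two": 2, "six": 6}
TENS = {"twenty": 20, "thirty": 30}


def build_number(tokens):
    """Shared helper: int if `tokens` form exactly one number, else None."""
    if len(tokens) == 1:
        return UNITS.get(tokens[0], TENS.get(tokens[0]))
    if len(tokens) == 2 and tokens[0] in TENS and tokens[1] in UNITS:
        return TENS[tokens[0]] + UNITS[tokens[1]]
    return None


def parse_number(text):
    """Strict entry point: the whole string must be a single number."""
    return build_number(text.lower().split())


def parse(text):
    """Lenient entry point: replace number words inside a sentence with digits."""
    tokens = text.split()
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate (two tokens) first, then a single token.
        for size in (2, 1):
            chunk = tokens[i:i + size]
            if len(chunk) < size:
                continue
            value = build_number([t.lower() for t in chunk])
            if value is not None:
                out.append(str(value))
                i += size
                break
        else:
            # No number starts here; keep the word unchanged.
            out.append(tokens[i])
            i += 1
    return " ".join(out)
```

In this sketch the two functions differ only in how strictly they treat surrounding text: `parse("one two")` rewrites each word, while `parse_number("one two")` returns None.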

noviluni commented 4 years ago

Hi @arnavkapoor! As we have merged the PR I think you can close this issue. :muscle: