Closed Manish-210 closed 3 years ago
I would actually expect `parse_number("half")` to return `Decimal('0.5')`. Maybe we could have a separate `parse_fraction` for `1/2`?
`parse_fraction` seems good; I'll try to implement it.
Some points to note for the `parse_fraction` function:
- Fractions can be spelled in several ways. For example, `3/4` can be written as three fourths, three over four, three divided by four, or simply three by four. The most commonly used form is three fourths, and I guess it is sufficient to implement a function for this form for now.
- For denominators of 3 and above, the pattern is the ordinal with an s, like three thirds (`3/3`) or five fourths (`5/4`). A denominator of 1 is spelled wholes and a denominator of 2 is spelled halves, like four wholes (`4/1`) and two halves (`2/2`).
- A small detail: if the numerator is greater than one, the denominator is plural; otherwise it is singular, like one whole (`1/1`) and one half (`1/2`), which is obvious.
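The spelling rules above could be sketched roughly as follows. This is a minimal illustration only: `NUMERATORS`, `DENOMINATORS` and `parse_word_fraction` are hypothetical names that are not part of number-parser, and the word tables cover just the examples mentioned in this thread.

```python
# Hypothetical word tables, just large enough for the examples in this thread.
NUMERATORS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

# Singular and plural forms map to the same value; 1 and 2 are the
# irregular cases ("whole(s)" and "half"/"halves").
DENOMINATORS = {
    "whole": 1, "wholes": 1,
    "half": 2, "halves": 2,
    "third": 3, "thirds": 3,
    "fourth": 4, "fourths": 4,
}


def parse_word_fraction(text):
    """Return (numerator, denominator) for input like 'three fourths'.

    Returns None when the input is not a recognised two-word fraction.
    """
    words = text.lower().split()
    if len(words) != 2:
        return None
    numerator = NUMERATORS.get(words[0])
    denominator = DENOMINATORS.get(words[1])
    if numerator is None or denominator is None:
        return None
    return numerator, denominator
```

The sketch returns a tuple rather than a reduced value so that inputs such as two halves stay as `(2, 2)`. It is deliberately lenient about singular versus plural agreement; a stricter version could reject one fourths.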
I have written the `parse_fraction` function given below:

```python
def parse_fraction(input_string, language=None):
    """Converts a single fraction written in words to a 'numerator/denominator' string"""
    if not input_string.strip():
        return None

    if language is None:
        language = _valid_tokens_by_language(input_string)

    if language != 'en':
        raise ValueError(f'"{language}" is not a supported language for parsing fraction')

    # "divided by" is listed before "by" so the longer separator is tried first.
    fraction_separators = ["divided by", "over", "by", "/"]

    for separator in fraction_separators:
        position_of_separator = input_string.find(separator)

        if position_of_separator == -1:
            continue

        string_before_separator = input_string[:position_of_separator]
        string_after_separator = input_string[position_of_separator + len(separator):]

        number_before_separator = parse_number(string_before_separator, language)
        number_after_separator = parse_number(string_after_separator, language)

        # Both sides of the separator must parse to a number.
        if number_before_separator is None or number_after_separator is None:
            return None

        return str(number_before_separator) + '/' + str(number_after_separator)

    return None
```

Few things to mention:
- `fraction_separators` is defined inside this function, but it should live somewhere else and be imported so that it can work for all languages. It only works for English currently.
- Fractions that end with 's', such as three fourths for `3/4`, are not converted.

I think it’s OK to provide an imperfect initial implementation; it can be improved over time based on feedback.
And yes, I think it would be best to include tests.
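As a rough illustration of the two points raised here (keeping the separators outside the function, keyed by language, and covering them with tests), something like the following could work. This is only a sketch: `FRACTION_SEPARATORS` and `split_fraction` are hypothetical names that do not exist in number-parser, and only the English entries come from this thread.

```python
# Hypothetical per-language table; only the English entries come from this thread.
# Longer separators come first so "divided by" wins over "by".
FRACTION_SEPARATORS = {
    "en": ["divided by", "over", "by", "/"],
}


def split_fraction(text, language="en"):
    """Split text on the first matching separator for the given language.

    Returns (left, right) with surrounding whitespace removed, or None
    when no separator for that language occurs in the text.
    """
    for separator in FRACTION_SEPARATORS.get(language, []):
        left, found, right = text.partition(separator)
        if found:
            return left.strip(), right.strip()
    return None


# A few checks of the kind a test file might contain.
assert split_fraction("three divided by four") == ("three", "four")
assert split_fraction("three over four") == ("three", "four")
assert split_fraction("3/4") == ("3", "4")
assert split_fraction("three fourths") is None
```

Like the function above, this matches separators as plain substrings, so a word that happens to contain "by" or "over" would also be split; matching on whole tokens would be one way to tighten that.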
Yes, I added the test data and sent a pull request for this. Should I mark this as an improvement or close this issue?
I’ve updated the description of https://github.com/scrapinghub/number-parser/pull/60 so that it closes this issue automatically once it is merged.
We can add support for the simple fractions as well like three-fourth, half, etc. Example: