strangetom / ingredient-parser

A tool to parse recipe ingredients into structured data
https://ingredient-parser.readthedocs.io/en/latest/
MIT License

Rounding ingredient quantities #24

Open Millerlm012 opened 1 week ago

Millerlm012 commented 1 week ago

@strangetom When the ingredient quantity is a float, have you thought about not rounding the decimal? I'm using your library (thank you, it's been great so far!), but I need to convert the float values back to fractions, and that requires higher precision.

I see in the code that you round to the thousandths place, which is fine for most cases but not for repeating decimals. I use a Stern-Brocot tree to convert back, but it needs at least five decimal digits of precision. I'd hate to fork your repo for something as silly as rounding. Not rounding would let the caller of the library decide what to do with the parsed value. Thoughts?
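To illustrate why the rounding matters here, the following is a minimal sketch of a Stern-Brocot search. It is not the poster's actual code: the function name `stern_brocot`, the tolerance `tol`, and the step limit are all assumptions made for this example.

```python
import math
from fractions import Fraction


def stern_brocot(x: float, tol: float = 1e-6, max_steps: int = 10_000) -> Fraction:
    """Approximate a non-negative float by walking the Stern-Brocot tree.

    Hypothetical helper for illustration; tol and max_steps are assumed values.
    """
    whole = math.floor(x)
    frac = x - whole  # search only the fractional part, in [0, 1)
    if frac <= tol:
        return Fraction(whole)
    if 1 - frac <= tol:
        return Fraction(whole + 1)

    lo_n, lo_d = 0, 1  # lower bound 0/1
    hi_n, hi_d = 1, 1  # upper bound 1/1
    for _ in range(max_steps):
        # The mediant of the two bounds is the next node in the tree.
        mid_n, mid_d = lo_n + hi_n, lo_d + hi_d
        mediant = mid_n / mid_d
        if abs(mediant - frac) <= tol:
            break
        if mediant < frac:
            lo_n, lo_d = mid_n, mid_d
        else:
            hi_n, hi_d = mid_n, mid_d
    return whole + Fraction(mid_n, mid_d)


print(stern_brocot(1.3333333333333333))  # 4/3
print(stern_brocot(1.333) == Fraction(4, 3))  # False
```

With the unrounded float the search stops at 1/3 almost immediately. With 1.333 the node 1/3 is off by about 3.3e-4, so a tight tolerance rejects it and the search continues to a much larger denominator.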

strangetom commented 1 week ago

Hi @Millerlm012

This sounds like a sensible request. I'll have to have a think about how to make sure quantities like 1 1/2 still get treated as a single token, which converting everything to a float was doing, but I don't think it'll be too difficult.

Millerlm012 commented 1 week ago

I think you can and should still convert everything to a float. If you stop returning a float, that'd be a breaking change your users won't be expecting. For cases like 1 1/2, 1.5 would be returned (as it is today). The edge case of 1 1/3 would return 1.3333333333333333 instead of 1.333.
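A quick check of the difference in plain Python (this only demonstrates the built-in `round()`; it does not reproduce the library's internals):

```python
exact = 4 / 3              # what 1 1/3 becomes as an unrounded float
rounded = round(exact, 3)  # rounding to three decimal places

print(exact)    # 1.3333333333333333
print(rounded)  # 1.333

# Values that terminate within three decimal places are unaffected.
print(round(1.5, 3))  # 1.5
```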

I haven't pulled and tested, but I'd assume you'd just need to remove the round() in the following locations:

strangetom commented 1 week ago

I was thinking of adding an option to return quantities as fractions.Fraction objects instead of floats. The current implementation using floats would remain the default to avoid breaking changes.

Using Fraction objects means we can represent fractions like 1/3 exactly and leave it up to end users of the library to decide how they want to handle them.
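As a small standard-library sketch of what "exactly" buys you, consider tripling a quantity. The variable names are only for illustration and nothing here calls the parser itself:

```python
from fractions import Fraction

third = Fraction(1, 3)

# Exact arithmetic: tripling one third gives exactly one.
tripled_exact = third * 3
print(tripled_exact == 1)  # True

# The same operation on the rounded float does not.
tripled_rounded = 0.333 * 3
print(tripled_rounded == 1)  # False

# Callers who still want a float can convert at the point of use.
as_float = float(third)
```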

Millerlm012 commented 1 week ago

Yeah, that'd work as well! Let me know if you need any assistance.

strangetom commented 5 days ago

I think I've got this done. It will be available in the next release, but is available on the develop branch now if you want to test it.

Default behaviour

>>> parse_ingredient("1/3 cup olive oil")
ParsedIngredient(name=IngredientText(text='olive oil',
                                     confidence=0.999714,
                                     starting_index=2),
                 size=None,
                 amount=[IngredientAmount(quantity=0.333,
                                          quantity_max=0.333,
                                          unit=<Unit('cup')>,
                                          text=' 1/3 cups',
                                          confidence=0.999915,
                                          starting_index=0,
                                          APPROXIMATE=False,
                                          SINGULAR=False,
                                          RANGE=False,
                                          MULTIPLIER=False,
                                          PREPARED_INGREDIENT=False)],
                 preparation=None,
                 comment=None,
                 purpose=None,
                 foundation_foods=[],
                 sentence='1/3 cup olive oil')

Quantity fractions

>>> parse_ingredient("1/3 cup olive oil", quantity_fractions=True)
ParsedIngredient(name=IngredientText(text='olive oil',
                                     confidence=0.999714,
                                     starting_index=2),
                 size=None,
                 amount=[IngredientAmount(quantity=Fraction(1, 3),
                                          quantity_max=Fraction(1, 3),
                                          unit=<Unit('cup')>,
                                          text=' 1/3 cups',
                                          confidence=0.999915,
                                          starting_index=0,
                                          APPROXIMATE=False,
                                          SINGULAR=False,
                                          RANGE=False,
                                          MULTIPLIER=False,
                                          PREPARED_INGREDIENT=False)],
                 preparation=None,
                 comment=None,
                 purpose=None,
                 foundation_foods=[],
                 sentence='1/3 cup olive oil')

When quantity_fractions=True, the quantity and quantity_max fields of every IngredientAmount object will be Fraction objects, even if the value is an integer (e.g. Fraction(1, 1) for 1) or the fraction is greater than 1 (e.g. Fraction(3, 2) for 1.5).
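Since every quantity arrives as a Fraction, a caller may want to display it as a whole or mixed number. The helper below is one possible sketch; `format_mixed` is a hypothetical name, not part of the library, and it assumes non-negative quantities.

```python
from fractions import Fraction


def format_mixed(q: Fraction) -> str:
    """Format a non-negative Fraction as '2', '1/3' or '1 1/2'.

    Hypothetical helper for illustration; not provided by ingredient-parser.
    """
    whole, rem = divmod(q.numerator, q.denominator)
    if rem == 0:
        return str(whole)
    if whole == 0:
        return f"{rem}/{q.denominator}"
    return f"{whole} {rem}/{q.denominator}"


print(format_mixed(Fraction(1, 1)))  # 1
print(format_mixed(Fraction(3, 2)))  # 1 1/2
print(format_mixed(Fraction(1, 3)))  # 1/3
```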

Whenever a v2 of this library comes around and I can make breaking changes, I'm thinking of making Fraction the default quantity type and removing the option to switch to float.

Millerlm012 commented 3 days ago

This is awesome, nice work! I'll pin the package import in my code to the develop branch until you get this merged into main. I'll let you know if I run into anything in testing!