ephes / kptncook

Fetch receipts from kptncook API and export them to mealie or paprika
MIT License
42 stars 4 forks source link

handling non-parsable recipes #23

Closed luebbert42 closed 2 months ago

luebbert42 commented 1 year ago

Hi Jochen,

I came across the recipe with the id 63f48c995000009a054010c4

Adding it to the local repo works fine, but get_kptncook_recipes_from_repository cannot handle it, so the local repo and the functionality which needs the local repo won't work any longer.

I was wondering how to fix that - trying to parse it before adding it to the local repo, skip the recipe with a try/except in the get_kptncook_recipes_from_repository function or trying to fix the json (probably mission impossible, as we never know what might be wrong in the future).

Any thoughts on this?

P.S. error message for the recipe mentioned above is:

CRITICAL:root:1 validation error for Recipe ingredients -> 1 -> ingredient -> uncountableTitle field required (type=value_error.missing)

but I think of it more as a general problem.

ephes commented 1 year ago

Hi Dorthe,

thanks for your report 😀. I just managed to reproduce the error and this definitively shouldn't happen. I added this snippet to show which ingredient might be the culprit here:

def get_kptncook_recipes_from_repository() -> list[Recipe]:
    fs_repo = RecipeRepository(settings.root)
    recipes = []
    for repo_recipe in fs_repo.list():
        try:
            recipes.append(Recipe.parse_obj(repo_recipe.data))
        except Exception as e:
            print(f"Could not parse recipe {repo_recipe.id}: {e}")
            for ingredient in repo_recipe.data.get("ingredients"):
                uncountable_title = ingredient["ingredient"].get("uncountableTitle")
                if uncountable_title is None:
                    print("ingredient: ", ingredient["ingredient"])
            return []
    return recipes

Ok, it's flour and the uncountableTitle is just missing. Hmm, maybe we can just allow this title to be None in the Recipe-Model? I'm not sure how this attribute is used. But in general we probably should only add recipes to the repository we are able to parse again later on 😄.

So we probably need two new tests:

  1. Make sure that an ingredient missing uncountableTitle can be parsed
  2. Make sure that only recipes are added which can be parsed later on (maybe think about this again, there might be a reason why this isn't the case now)

Do you want to give it a try?

Best, Jochen

luebbert42 commented 1 year ago

Hi Jochen, yes I will look into that. I am currently rewriting my little app for "raffle based cooking events", but I am sure I will find some time on that task as well :-) I am still thinking about if there should be an option to download malformed recipes if you really really want to have them and store the json somewhere. I agree that adding broken recipes to the local repo is not good though. Best, Dorthe

ephes commented 1 year ago

Hi Dorthe,

thanks a lot. An app for cooking events, hmm, ok I'm curious 😁. Hmm, storing a recipe that you cannot parse again might make sense if some change in the future makes it parsable, maybe that's the case here. Another option would be to ask the user to "heal" the recipe by providing a required attribute (because we know which one is missing), but this is probably a lot of work. But maybe letting the user choose might be the best option here. Some dialog like this?

  1. Save recipe anyway
  2. Provide the required value manually
  3. Discard the option

Best, Jochen

luebbert42 commented 1 year ago

Hi Jochen,

This seems to be the absolute exception that incorrect json is delivered. I therefore suggest a minimal fix with a root validator in the model:


class IngredientDetails(BaseModel):
    typ: str
    localized_title: LocalizedString
    uncountableTitle: LocalizedString
    category: str

    @root_validator(pre=True)
    def fix_json_errors(cls, values):
        if 'uncountableTitle' not in values or values['uncountableTitle'] is None:
            if 'numberTitle' in values:
                values['uncountableTitle'] = values['numberTitle']
        return values

    class Config:
        alias_generator = to_camel

What do you think, too hackish?

Best, Dorthe

P.S. The cooking app mentioned above is basically a lottery: we usually play in teams of two, I configure how many courses and which countries/topic are part of the cooking event (normally we take the whole list of about 250 countries from all over the world, including some historic onces, like GDR). Then each team gets a random country and a course assigned by the app. Then you have to do some research on the country and what people eat there and pick a recipe. It is a big secret what country and dessert you have.

Then we meet and each team cooks its course. Then the national anthem is played and the other people have to guess the country (but guessing is also allowed during cooking). Sometimes it kind of escalates and people bring drinks from "their country" or do a little talk about it or wear local clothes. ;-)

Last time we had some starter from Turks and Caicos Island, a main course from Pakistan and a very crazy selection of desserts from Japan.

ephes commented 1 year ago

Hi Dorthe,

this is probably ok as long as there's a test for this so the behavior is documented. Something like this maybe?

def test_ingredient_details_without_uncountable_title():
    ingredient_details = {
        "typ": "ingredient",
        "localizedTitle": {"de": "Zucker", "en": "sugar"},
        "numberTitle": {"de": "Zucker", "en": "sugar"},
        "uncountableTitle": None,
        "category": "baking",
    }
    ingredient_details = IngredientDetails(**ingredient_details)
    assert ingredient_details.uncountable_title == ingredient_details.number_title

And then The model becomes:

class IngredientDetails(BaseModel):
    typ: str
    localized_title: LocalizedString
    number_title: LocalizedString
    uncountable_title: LocalizedString | None
    category: str

    @root_validator(pre=True)
    def fix_json_errors(cls, values):
        if 'uncountableTitle' not in values or values['uncountableTitle'] is None:
            if 'numberTitle' in values:
                values['uncountableTitle'] = values['numberTitle']
        return values

    class Config:
        alias_generator = to_camel

The camelCase uncountableTitle was an accident from a previous PR 😅. I just added you as a collaborator to this repository so you can just go ahead and change the code.

Your cooking events seems to be a lot of fun. I imagine it could be difficult getting all the ingredients, but this could also be part of the fun. Really cool idea 😁.

best Jochen

ephes commented 2 months ago

Just skipping invalid recipes for now, maybe it can be reopened later for a cleaner solution.