Closed: jayaddison closed this issue 4 years ago
One possible idea here is to create 'disambiguation expansions'. `steak` would map to `beef` in a disambiguation expansion.
We could then ensure that each set of expansions is anchored to a parent that contains some of the same disambiguation-expansion tokens, with a preference for 'original' tokens over ones added at expansion time.
Therefore, in the third example listed above, `tuna steak` would prefer to be anchored to a root with 'tuna' as a token.
Perhaps the existing `contents` token generation could be re-used for this case?
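To make the anchoring idea concrete, here is a minimal sketch. It assumes a hypothetical in-memory expansion table and simple whitespace tokenization; `EXPANSIONS`, `expand`, and `choose_root` are illustrative names, not existing code in the backend. Matches on original tokens are ranked ahead of matches on tokens added at expansion time.

```python
# Hypothetical expansion table (illustrative only): a token maps to the
# extra tokens it implies when no other context is present.
EXPANSIONS = {"steak": {"beef"}}


def expand(name):
    """Return (original_tokens, added_tokens) for a product name."""
    original = set(name.lower().split())
    added = set()
    for token in original:
        added |= EXPANSIONS.get(token, set())
    # Tokens already present in the name do not count as 'added'.
    return original, added - original


def choose_root(name, roots):
    """Pick the root that shares the most tokens with `name`.

    Scores are compared as tuples, so any match on an original token
    outranks matches that only involve expansion-added tokens.
    Returns None when no root shares a token with the product.
    """
    original, added = expand(name)
    best, best_score = None, (0, 0)
    for root in roots:
        root_tokens = set(root.lower().split())
        score = (len(original & root_tokens), len(added & root_tokens))
        if score > best_score:
            best, best_score = root, score
    return best


roots = ["beef", "tuna"]
tuna_steak_root = choose_root("tuna steak", roots)
plain_steak_root = choose_root("steak", roots)
```

With these assumptions, `tuna steak` matches the `tuna` root on an original token, while plain `steak` only reaches `beef` through its expansion.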
While the suggested expansions approach may work, one drawback is that it may become difficult to reason about and debug.
An alternative approach would be to create a 'remappings' file that contains manual overrides for the parent element for specific product IDs. For example, we might want to override product A to have no parent element (i.e. it would become a root product), or we may wish to override product C so that it has product B as a parent.
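As a rough illustration of the 'remappings' idea, the sketch below assumes a JSON file that maps a product ID to its overridden parent ID, with `null` meaning "make this a root product". The file format, the `apply_remappings` helper, and the `root_beer`, `beer`, and `steak_sirloin` IDs are all assumptions made for the example.

```python
import json

# Hypothetical remappings file contents: product ID -> overridden parent ID.
# A null parent means the product becomes a root product.
REMAPPINGS_JSON = """
{
  "steak_tuna": "tuna",
  "root_beer": null
}
"""


def apply_remappings(parents, remappings):
    """Return a copy of `parents` with the manual overrides applied."""
    updated = dict(parents)
    for product_id, parent_id in remappings.items():
        updated[product_id] = parent_id
    return updated


# Parent IDs as the naive name-based logic might have computed them.
computed = {
    "steak_tuna": "steak",
    "root_beer": "beer",
    "steak_sirloin": "steak",
}

resolved = apply_remappings(computed, json.loads(REMAPPINGS_JSON))
```

Products not listed in the file keep their computed parent, so the overrides stay small and easy to audit.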
Selectively re-indexed affected recipes using the `crawler` command:
crawler/reciperadar $ pipenv run python recipes.py --where "exists (select * from recipe_ingredients as ri where ri.recipe_id = recipes.id and (ri.description ilike '%tuna steak%' or ri.description ilike '%halibut steak%' or ri.description ilike '%root beer%'))" --recrawl
Describe the bug
Because the logic to determine the ingredient hierarchy is currently naive (it searches based simply on the name of the parent ingredient), `tuna steak` is being categorized as a sub-ingredient of `steak`. This means that a query for `steak` will return recipes containing `tuna steak`, and that queries for `tuna` will not return recipes containing `tuna steak`.

To Reproduce
Steps to reproduce the behavior:
`parent_id` field

Expected behavior
The `parent_id` of `steak_tuna` should be `tuna`.

Relates to https://github.com/openculinary/backend/issues/24