Describe the reason for these changes and the problem that they solve
This changeset adds a reference to the ingreedy-data dataset, which contains consolidated nutritional information from multiple sources including the UK CoFID (aka 'McCance') and the US FoodData Central databases.
In order to perform matching between the RecipeRadar dataset and ingreedy-data, we use the normalized_name field from the latter's consolidated JSON format and build a search index using the names as documents. For each named ingredient in RecipeRadar, a query is performed on the search index and the best-matching result is selected.
If no matches are found and the RecipeRadar ingredient has a 'parent' (e.g. tofu is the parent of firm tofu), then nutritional information from the parent element is used as a fallback where present.
Matching accuracy hasn't yet been measured or reviewed, so this algorithm will likely require further development.
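The matching and parent-fallback steps described above could be sketched roughly as follows. This is an illustration only, not the code in this changeset: the description does not name the search library used, so a toy in-memory inverted index stands in for it, and the matching rule (every query token must appear in the indexed name, shortest name wins) is an assumption. The function names, the `name`, `parent` and `nutrition` keys, and the sample values are all hypothetical; only `normalized_name` comes from the ingreedy-data format mentioned above.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a name and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(records):
    """Map each token to the positions of records whose normalized_name contains it."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        for token in tokenize(record["normalized_name"]):
            index[token].add(position)
    return index


def best_match(index, records, name):
    """Return the record containing every query token, preferring shorter names.

    Assumption: an AND-style query with a length tie-break. The real
    scoring used by the changeset may differ.
    """
    tokens = tokenize(name)
    if not tokens:
        return None
    candidates = set.intersection(*(index.get(token, set()) for token in tokens))
    if not candidates:
        return None

    def rank(position):
        # Fewer tokens first, then original order for determinism
        return (len(tokenize(records[position]["normalized_name"])), position)

    return records[min(candidates, key=rank)]


def nutrition_for(ingredient, index, records):
    """Look up nutrition for an ingredient, falling back to its parent if unmatched."""
    match = best_match(index, records, ingredient["name"])
    if match is not None:
        return match["nutrition"]
    parent = ingredient.get("parent")
    if parent is not None:
        return nutrition_for(parent, index, records)
    return None


# Placeholder records; the nutrition values are made up for illustration
records = [
    {"normalized_name": "tofu", "nutrition": {"energy_kcal": 100}},
    {"normalized_name": "tofu fried", "nutrition": {"energy_kcal": 200}},
    {"normalized_name": "olive oil", "nutrition": {"energy_kcal": 800}},
]
index = build_index(records)

tofu = {"name": "tofu", "parent": None}
firm_tofu = {"name": "firm tofu", "parent": tofu}
```

In this sketch, `nutrition_for(firm_tofu, index, records)` finds no record containing both "firm" and "tofu", so it falls back to the parent and returns the entry for "tofu". An ingredient with no match and no parent yields `None`, which corresponds to leaving nutritional information absent.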
Briefly summarize the changes
Add a git submodule reference to ingreedy-data
Weave data together from each of the root, McCance and FDC ingreedy-data JSON files
Build a search index and perform query-based ingredient nutrition matching
Update the hierarchy.json output document to include nutritional information
How have the changes been tested?
Manual testing and inspection
List any issues that this change relates to
Relates to the RecipeRadar Q3 2020 roadmap.
cc @tomwhite