Closed jayaddison closed 2 years ago
Canonicalizations are no longer an issue; this bug was resolved as part of (and was a motivating factor for) openculinary/backend#54.
Based on re-running the repro query from the description (with one slight modification: product_id -> product_name_id), it does appear that the term chilli continues to lack product mappings in a number of cases. That's a separate issue, however, and not related to canonicalization/synonyms.
Describe the bug
Among ingredient lines that are not correctly identified by the knowledge graph, coriander and red chillies appear to be among the most frequent. This makes me think that something could be broken with our handling of product canonicalizations (implemented using synonym support in hashedixsearch).

To Reproduce
Ran a query on the backend PostgreSQL database:
...there's a fair amount of noise and stopwords in there (whole, 1, ...), but also some easy ingredient names that should have been matched to products.

Expected behavior
Products with canonicalized names should be identified reliably.

Recommendation
If synonyms are the cause of this, it may be worth writing up a brief design spec about how synonyms should behave in hashedixsearch. As far as I know, this wasn't clearly specified before an initial implementation was provided.
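
As a starting point for such a spec, here is a rough sketch of the behavior I would expect from synonym-based canonicalization. It is plain Python and does not use the hashedixsearch API; the SYNONYMS table, the PRODUCTS set, and the canonicalize and find_products helpers are all hypothetical names made up for illustration.

```python
# Hypothetical synonym table: each variant spelling maps to one canonical product name.
SYNONYMS = {
    "chilli": "chili",
    "chillies": "chili",
    "chilies": "chili",
    "cilantro": "coriander",
}

# Hypothetical set of canonical product names known to the knowledge graph.
PRODUCTS = {"chili", "coriander"}


def canonicalize(term: str) -> str:
    """Return the canonical form of a term, or the lowercased term if no synonym exists."""
    term = term.lower()
    return SYNONYMS.get(term, term)


def find_products(line: str, products: set = PRODUCTS) -> list:
    """Return the canonical products mentioned in an ingredient line, in order of appearance."""
    found = []
    for token in line.split():
        canonical = canonicalize(token)
        if canonical in products and canonical not in found:
            found.append(canonical)
    return found
```

Under these assumptions, a line such as "2 red chillies" should resolve to the chili product, while a line made up only of noise and stopwords (for example "1 whole") should resolve to nothing. Whether multi-word synonyms and plural forms belong in the synonym table or in a separate stemming step is exactly the sort of question the spec would need to settle.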