openculinary / knowledge-graph

The RecipeRadar knowledge graph stores and provides access to recipe and ingredient relationship information.
GNU Affero General Public License v3.0

Ingredient string 'clove garlic' should output parsed product 'garlic' instead of 'clove' #53

Closed jayaddison closed 4 years ago

jayaddison commented 4 years ago

Describe the bug

Currently, when an ingredient string uses the descriptive phrase '... clove garlic', the product in the ingredient is identified as 'clove' instead of the expected result, 'garlic'.

This happens because both products rank equally as single-term matches for the input string, and there is no tie-breaker to pick a 'winner' between the two terms.

We should use the term frequency for each product as a tie-breaker in situations like this; garlic has a much higher term frequency than clove, and in the case of ambiguity it is more likely to be the intended result.
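The proposed tie-breaker can be sketched roughly as follows. This is only an illustration of the idea, not the actual search implementation: the `Candidate` type, the `pick_product` helper, and the frequency counts are all hypothetical and were made up for this example.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    product: str
    matched_terms: int   # how many terms of the input string this product matched
    term_frequency: int  # hypothetical count of how often the product term occurs


def pick_product(candidates):
    # Rank by number of matched terms first; term frequency is only
    # consulted when two candidates match the same number of terms.
    best = max(candidates, key=lambda c: (c.matched_terms, c.term_frequency))
    return best.product


# Illustrative counts only; they are not measured from any real index.
candidates = [
    Candidate("clove", matched_terms=1, term_frequency=40),
    Candidate("garlic", matched_terms=1, term_frequency=900),
]

selected = pick_product(candidates)
print(selected)  # garlic
```

Because the sort key is a tuple, a candidate that matches more terms still wins regardless of frequency; the frequency only decides between otherwise equal matches such as 'clove' and 'garlic' here.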

To Reproduce

Steps to reproduce the behavior:

  1. Perform an ingredient parsing query for the string 1 clove garlic
  2. Observe that the parsed product name is clove
$ curl -H 'Host: knowledge-graph' -XPOST localhost:30080/ingredients/query --data 'descriptions[]=1 clove garlic' | jq
...
{
  "results": {
    "1 clove garlic": {
...
        "product": "clove",
...
  }
}

Expected behavior

After parsing, the product 'garlic' should be identified.

jayaddison commented 4 years ago

Resolved via openculinary/hashedixsearch#25