openculinary / knowledge-graph

The RecipeRadar knowledge graph stores and provides access to recipe and ingredient relationship information.
GNU Affero General Public License v3.0

Canonicalization refactor #31

Closed (jayaddison closed this 4 years ago)

jayaddison commented 4 years ago

Describe the reason for these changes and the problem that they solve

A fair amount of tech debt had been building up between the web endpoint, product models, and core search code in the knowledge-graph.

Recent changes to canonicalization, made to support query-time evaluation of synonyms, highlighted this again, and it seems worthwhile to pay down some of that debt with this changeset.

Briefly summarize the changes

  1. Canonicalization logic is moved into the core Product class
  2. Consistent ProductAnalyzer and ProductStemmer subclasses are introduced for indexing and query-time document transformation
  3. Product display name, content generation and ID generation are refactored
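
For illustration only, here is a minimal sketch of the shape these changes describe. It is not the code from this changeset: the class bodies, the `canonicalize` method, the `id` property, and the naive plural-stripping rule are all assumptions made for the example, and the real ProductAnalyzer and ProductStemmer subclass search-library base classes that are omitted here. The point it shows is that indexing and query-time text pass through the same analyzer and stemmer, so both sides canonicalize to the same form.

```python
import re


class ProductStemmer:
    """Hypothetical stemmer: naive plural stripping, not the project's real rules."""

    def stem(self, token: str) -> str:
        if len(token) > 4 and token.endswith("ies"):
            return token[:-3] + "y"  # e.g. "berries" becomes "berry"
        if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
            return token[:-1]  # e.g. "onions" becomes "onion"
        return token


class ProductAnalyzer:
    """Hypothetical analyzer: lowercases, tokenizes, then stems each token."""

    def __init__(self, stemmer: ProductStemmer) -> None:
        self.stemmer = stemmer

    def tokens(self, text: str) -> list:
        return [self.stemmer.stem(t) for t in re.findall(r"[a-z]+", text.lower())]


class Product:
    # One shared analyzer, so index-time and query-time text are transformed identically.
    analyzer = ProductAnalyzer(ProductStemmer())

    def __init__(self, name: str) -> None:
        self.name = name

    def canonicalize(self) -> str:
        """Canonical form of the product name (assumed to live on Product, per item 1)."""
        return " ".join(self.analyzer.tokens(self.name))

    @property
    def id(self) -> str:
        """Hypothetical ID scheme derived from the canonical form."""
        return self.canonicalize().replace(" ", "_")


indexed = Product("Red Onions")
queried = Product("red onion")
same_form = indexed.canonicalize() == queried.canonicalize()
```

With this sketch, a product indexed as "Red Onions" and a query for "red onion" reduce to the same canonical string, which is the property that query-time synonym evaluation relies on.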

How have the changes been tested?

  1. Unit test coverage is provided
  2. The product hierarchy.json has been regenerated locally via pipenv run python -m scripts.hierarchy --update

List any issues that this change relates to

Relates to #29
Relates to #30