Various entity types are handled during construction of the ingredient hierarchy:
Products ingested from the API service
Stopwords calculated from the indexed set of products
Hierarchy nodes derived from the product graph
This changeset introduces optional filesystem-backed storage of these entities, primarily so that the hierarchy can be served via a web service once it has been generated, and also to speed up development iteration. If the cache files are absent (for instance when the service first starts, or after manual removal in a development environment), the cache is regenerated at runtime.
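The load-or-regenerate behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in this changeset: `load_or_generate`, `build_stopwords`, the JSON file format, and the cache path are all hypothetical names and choices made for the example.

```python
import json
import tempfile
from pathlib import Path


def load_or_generate(path, generate):
    """Return cached entities from `path`; regenerate and store them if the file is absent."""
    path = Path(path)
    if path.exists():
        # Cache hit: read the previously stored entities from disk.
        with path.open() as f:
            return json.load(f)
    # Cache miss (first start, or the file was removed): rebuild at runtime.
    entities = generate()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        json.dump(entities, f)
    return entities


calls = []  # records how many times the (hypothetical) generator actually runs


def build_stopwords():
    calls.append(1)
    return ["fresh", "organic"]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "cache" / "stopwords.json"
    first = load_or_generate(cache_file, build_stopwords)   # file absent: regenerated
    second = load_or_generate(cache_file, build_stopwords)  # file present: read from disk
```

Deleting the file between runs forces regeneration, which matches the development workflow described above.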
This changeset also includes a few small items picked up during refactoring:
Output of the hierarchy is now in a JSON format ready for consumption by the web service
ID generation no longer relies on hashing
Python generators are used more extensively so that loading and indexing are processed lazily rather than blocking on the full dataset
Pre-parsing via ingreedy-py is re-introduced; this aims to drop irrelevant tokens earlier on larger/more noisy datasets
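As a rough illustration of the generator-based loading and indexing mentioned in the list above, the sketch below chains two generators so that each product is processed as it arrives. The function names, record shape, and stopword set are assumptions made for the example and do not reflect the actual code in this changeset; the token filtering here is a plain stopword check, not ingreedy-py.

```python
def load_products(records):
    """Yield normalized products one at a time instead of building a full list."""
    for record in records:
        yield {"id": record["id"], "name": record["name"].lower()}


def tokenize(products, stopwords):
    """Lazily yield (product id, tokens) pairs with stopwords dropped."""
    for product in products:
        tokens = [t for t in product["name"].split() if t not in stopwords]
        yield product["id"], tokens


# Hypothetical input records standing in for products from the API service.
records = [
    {"id": 1, "name": "Fresh Basil Leaves"},
    {"id": 2, "name": "Organic Olive Oil"},
]

# Nothing is processed until the pipeline is consumed.
pipeline = tokenize(load_products(records), stopwords={"fresh", "organic"})
indexed = dict(pipeline)
```

Because each stage pulls from the previous one on demand, a consumer can start indexing before the full product set has been loaded.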