openfoodfacts / openfoodfacts-server

Open Food Facts database, API server and web interface - 🐪🦋 Perl, CSS and JS coders welcome 😊 For helping in Python, see Robotoff or taxonomy-editor
http://openfoodfacts.github.io/openfoodfacts-server/
GNU Affero General Public License v3.0

Generate a JSON with extended synonyms as we build taxonomies #10742

Open alexgarel opened 1 month ago

alexgarel commented 1 month ago

Problem

As we build taxonomies, we export a JSON file for each taxonomy. This is very useful for third-party applications that deal with Open Food Facts data and want to run analyses based on taxonomies. For example, it is used by robotoff and search-a-licious.

As reported on https://wiki.openfoodfacts.org/Taxonomy_access, we currently have two versions: one with only synonyms, and one with additional properties. But there is no export with extended synonyms (where we replace synonyms with their synonyms), even though it would be very useful to search-a-licious.

Proposed solution

Export a .extended.json, which contains an extended_synonyms property with the extended synonyms.

My guess is that we should avoid putting properties in it, to keep the file from becoming too massive. Anyone who needs both can download the .extended and .full versions and merge them, which is easy to do.
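To illustrate how simple that merge could be on the consumer side, here is a minimal Python sketch. It rests on assumptions, since the .extended.json format does not exist yet: both files are taken to be JSON objects keyed by canonical tag id, and extended_synonyms is taken to be keyed by language code like the existing synonyms. The function name merge_taxonomies and the sample entries are hypothetical and only there for illustration.

```python
import json


def merge_taxonomies(full: dict, extended: dict) -> dict:
    """Return a new dict with the entries of `full`, each enriched with the
    `extended_synonyms` found under the same tag id in `extended`.

    Assumption: both inputs map tag ids to entry dicts.
    """
    merged = {}
    for tag_id, entry in full.items():
        combined = dict(entry)  # shallow copy, so `full` is left untouched
        extra = extended.get(tag_id, {})
        if "extended_synonyms" in extra:
            combined["extended_synonyms"] = extra["extended_synonyms"]
        merged[tag_id] = combined
    return merged


# Hypothetical sample data standing in for the parsed .full.json and
# .extended.json files (in practice, load them with json.load).
full = {
    "en:sugar": {"name": {"en": "Sugar"}},
    "en:brown-sugar": {"name": {"en": "Brown sugar"}, "parents": ["en:sugar"]},
}
extended = {
    "en:brown-sugar": {
        "extended_synonyms": {"en": ["brown sugar", "brown sucrose"]},
    },
}

merged = merge_taxonomies(full, extended)
print(json.dumps(merged["en:brown-sugar"], indent=2))
```

Entries that have no counterpart in the extended file are copied unchanged, so the merged result can be used wherever the .full version is used today.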

Code pointers

It happens in Tags.pm, in build_taxonomies build_tags_taxonomy (at the end)

alexgarel commented 1 week ago

https://static.openfoodfacts.org/data/taxonomies/categories.extended.json is not there, @stephanegigandet, even though the other files were generated on Oct 3 in /srv/off/html/data/taxonomies