openfoodfacts / folksonomy_api

A light REST API designed for Open Food Facts folksonomy engine
https://wiki.openfoodfacts.org/Folksonomy_Engine
GNU Affero General Public License v3.0

Properties related to non-food products should not be returned for food products (and vice-versa) #128

Open CharlesNepote opened 1 year ago

CharlesNepote commented 1 year ago

There are now more than 75 properties created by users (2023-01).

Some of them are dedicated to food products.

Some of them could be used for many, or even nearly any, kind of product:

Some of them could be used for different kinds of products, including food:

And some of them are clearly not relevant for food products and even sometimes exclusively dedicated to some categories:

These properties are visible in three places:

I think that deploying Folksonomy Engine to sister sites (see #62) will quickly lead to anarchy if we don't address this topic.

CharlesNepote commented 1 year ago

Solution 1

The first solution is to use prefixes to identify which product types a property applies to. No API changes are needed. When suggesting properties, the UI can filter the relevant ones.

Case 1: the property is dedicated to food products. E.g. packaging:nutriscore:multiple.

Case 2: the property is dedicated to general products and beauty products, but not food products. E.g. flammable.

PROs:

CONs:

Solution 2

Create another DB which lists all properties and their targeted products.

| property | food | beauty | general |
| --- | --- | --- | --- |
| packaging:nutriscore:multiple | true | | |
| flammable | | true | true |
| conservation:temperature | true | true | true |
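
A minimal sketch of how such a lookup table could drive suggestions. `PROPERTY_TARGETS` and `properties_for` are hypothetical names, not part of the current API; the rows repeat the three properties listed above, with `flammable` assigned to beauty and general as described in Case 2 of Solution 1.

```python
# Hypothetical lookup table: property -> product types it targets.
PROPERTY_TARGETS = {
    "packaging:nutriscore:multiple": {"food"},
    "flammable": {"beauty", "general"},
    "conservation:temperature": {"food", "beauty", "general"},
}


def properties_for(product_type: str) -> list[str]:
    """Return the properties relevant to one product type, sorted by name."""
    return sorted(
        name
        for name, targets in PROPERTY_TARGETS.items()
        if product_type in targets
    )
```

With this data, `properties_for("food")` leaves out `flammable`, which is the behaviour the issue title asks for.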

PROs:

CONs:

Solution 3

Create a taxonomy for the properties. A taxonomy allows complex cases. Eg.:

<food_properties
en:packaging:nutriscore:multiple
fr:emballage:nutriscore:multiple
en:description:There are more than one Nutri-Score displayed on the packaging. The values should be the letters separated by commas.
en:example:a,b
xx:regexp_control:[a-e],[a-e](,[a-e])*

<beauty_properties
<general_properties
en:flammable
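
The `xx:regexp_control` line above could be enforced with a few lines of Python. This is only a sketch: it assumes the pattern must match the whole value, which the issue does not state, and `is_valid_value` is a hypothetical helper.

```python
import re

# Pattern copied from the xx:regexp_control line of the taxonomy entry.
NUTRISCORE_MULTIPLE = re.compile(r"[a-e],[a-e](,[a-e])*")


def is_valid_value(value: str) -> bool:
    """Assumption: the whole value must match the pattern, not just a part."""
    return NUTRISCORE_MULTIPLE.fullmatch(value) is not None
```

Under that assumption the documented example `a,b` is accepted, while a single letter such as `a` is rejected, since the pattern requires at least two letters.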

PROs:

CONs:

benbenben2 commented 1 year ago

Do you think that we could have the option to add a column in the database for the sources (Open Food Facts, Open Beauty Facts, etc.)?

That way, flammable entered in both Open Food Facts and Open Beauty Facts would lead to two rows in the database.

Then, we could either look at flammable for Open Food Facts or for Open Beauty Facts or for both.
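
A rough sketch of this idea with an in-memory SQLite database. The table name `tags`, the `source` values `off` and `obf`, and the product codes are illustrative assumptions, not the actual Folksonomy Engine schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the extra "source" column records which site a tag came from.
conn.execute("CREATE TABLE tags (source TEXT, product TEXT, k TEXT, v TEXT)")
conn.executemany(
    "INSERT INTO tags VALUES (?, ?, ?, ?)",
    [
        ("off", "0000000000001", "flammable", "yes"),
        ("obf", "0000000000002", "flammable", "yes"),
        ("off", "0000000000001", "packaging:nutriscore:multiple", "a,b"),
    ],
)


def keys_for(sources: list[str]) -> list[str]:
    """List the distinct keys used in the given sources (one or several)."""
    placeholders = ",".join("?" for _ in sources)
    rows = conn.execute(
        f"SELECT DISTINCT k FROM tags WHERE source IN ({placeholders}) ORDER BY k",
        sources,
    )
    return [k for (k,) in rows]
```

Here `flammable` is stored once per source, so a query can look at one source, the other, or both.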

alexgarel commented 1 year ago

I like the idea of @benbenben2 because it does not require more work. It just separates usages. Until a property is added to a product type, it won't be suggested.

And an API call can decide whether or not to filter on database type.

CharlesNepote commented 1 year ago

@benbenben2 @alexgarel yes, this solution is quite simple; it's a bit like solution 1, but with an extra field.

> it does not require more work

It does:

Solution 1 seems simpler: we only need to modify the front-end (and perhaps move a few wiki pages). But it has the drawback of using long names, which can differ between the database and the front-end...

benbenben2 commented 1 year ago

Maybe I do not get the point of the folksonomy engine. Isn't it a tool literally for everyone? If we impose rules and conventions based on documentation, don't we risk making it not for everyone?

alexgarel commented 1 year ago

@benbenben2 this is the way OpenStreetMap works!

(as we democratize Folksonomy Engine, we may also add some widgets for common properties)

CharlesNepote commented 1 year ago

@benbenben2 as in the OpenStreetMap project, everyone can create his/her own properties, but properties are much more useful if they are also used by other people. Documenting properties is not mandatory, but undocumented properties have less chance of being reused.

teolemon commented 1 week ago

I believe this might come down to the category level, and might call for a dual approach: a statistical one (this category is often used with those properties) and a taxonomized one (once we find a way to generate associated properties at scale, possibly using LLMs - see: https://github.com/openfoodfacts/openfoodfacts-ai/issues/296)

teolemon commented 1 week ago

I don't see this as a strict blocker for the deployment.

alexgarel commented 1 week ago

Looking at it again, @CharlesNepote, another way could be to keep solution 1, adding a prefix per product type (plus possibly a general one), but:

  1. add a prefix filter (multiple prefixes possible) to the /keys API (very easy to do)
  2. abstract the prefix away at the JavaScript and interface level: the user does not see the prefix, but a meaningful interface. For example, items could be split between food and general ones (two parts of the table), and autocomplete would search on the off + general prefixes but display the prefix as a property type
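
The two points above could look roughly like this. It is a sketch only: the prefix names `off`, `obf` and `general`, the `:` separator, and both helper functions are assumptions for illustration, and it filters an in-memory list rather than calling the real /keys endpoint.

```python
# Hypothetical key list, as a prefix-based scheme might store it.
KEYS = [
    "off:packaging:nutriscore:multiple",
    "obf:flammable",
    "general:conservation:temperature",
]


def filter_keys(keys: list[str], prefixes: list[str]) -> list[str]:
    """Point 1: keep keys whose first segment is one of the given prefixes."""
    wanted = set(prefixes)
    return [k for k in keys if k.split(":", 1)[0] in wanted]


def display(key: str) -> tuple[str, str]:
    """Point 2: split a stored key into (property type, name shown to the user)."""
    prefix, _, name = key.partition(":")
    return prefix, name
```

For a food product, autocomplete would call something like `filter_keys(KEYS, ["off", "general"])` and render each result through `display`, so the user only ever sees the unprefixed name.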

teolemon commented 1 week ago

There are properties common to food and non-food products, so I would really use ML/statistics/categories.