openfoodfacts / smooth-app

🤳🥫 The new Open Food Facts mobile application for Android and iOS, crafted with Flutter and Dart
https://world.openfoodfacts.org/open-food-facts-mobile-app?utm_source=off&utf_medium=web&utm_campaign=github-repo
Apache License 2.0

Implement new algorithm to compute how products match users' food preferences -- being tested on OFF website #1894

Closed stephanegigandet closed 2 years ago

stephanegigandet commented 2 years ago

Problem

Determining whether (and how well) a product matches a user's food preferences is a complex topic, especially as some preferences are "hard" requirements (e.g. absolutely no gluten, or a strict vegan diet) while others are just preferences (e.g. I prefer to eat vegetarian and organic products). We also often have incomplete product information (e.g. if we didn't recognize all ingredients, we can't strictly ascertain that a product is vegan or gluten-free).

There is a very long discussion about this problem here: #1060

Proposed solution

A proposed solution was introduced on the forum: https://forum.openfoodfacts.org/t/proposal-for-a-new-personalized-experience-in-the-open-food-facts-app-and-website/71

The proposed match statuses are:

  • very_good_match: score >= 75
  • good_match: score >= 50
  • poor_match: score < 50
  • unknown_match: at least one mandatory attribute is unknown, or unknown attributes weigh more than 50% of the score
  • may_not_match: at least one mandatory attribute score is <= 50 (e.g. may contain traces of an allergen)
  • does_not_match: at least one mandatory attribute score is <= 10 (e.g. contains an allergen, is not vegan)
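
As an illustration only, the six rules above can be sketched as one function. The function name `matchStatus`, its input shape, and the order in which the rules are checked (mandatory rules before score thresholds) are assumptions made for this sketch; the reference implementation remains product-search.js.

```javascript
// Hypothetical helper: names and input shape are illustrative, not the OFF API.
// score: weighted score from 0 to 100
// mandatoryScores: scores (0-100) of the mandatory attributes whose status is known
// hasUnknownMandatory: true if at least one mandatory attribute is unknown
// unknownWeightShare: share (0 to 1) of the total weight carried by unknown attributes
function matchStatus({
  score,
  mandatoryScores = [],
  hasUnknownMandatory = false,
  unknownWeightShare = 0,
}) {
  // Assumption: rules on mandatory attributes take precedence over the score.
  if (mandatoryScores.some((s) => s <= 10)) return "does_not_match";
  if (mandatoryScores.some((s) => s <= 50)) return "may_not_match";
  if (hasUnknownMandatory || unknownWeightShare > 0.5) return "unknown_match";
  if (score >= 75) return "very_good_match";
  if (score >= 50) return "good_match";
  return "poor_match";
}

console.log(matchStatus({ score: 88 })); // very_good_match
console.log(matchStatus({ score: 70, mandatoryScores: [40] })); // may_not_match
```

Note that a high score alone is not enough: one mandatory attribute at or below 50 downgrades the product regardless of the other attributes.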

JavaScript code that implements the algorithm: https://github.com/openfoodfacts/openfoodfacts-server/blob/main/html/js/product-search.js

Mockups

This matching system is now live on the OFF website.

image

Demo: https://www.loom.com/share/33ca5a6179bd49b29ad2404a1b1af2ff

This system is being tested on the website, to check that it gives relevant results. Once we have verified that it works, we should replicate it in Smoothie.

monsieurtanuki commented 2 years ago

@stephanegigandet My remarks:

monsieurtanuki commented 2 years ago

Btw there's a value missing in the screenshot: "Does not match"

monsieurtanuki commented 2 years ago

@stephanegigandet I think there's a flaw in the algorithm: you loop on the product attributes, and then see if they match with the preferences. That should be the other way around, shouldn't it?

Like, some product contains meat but doesn't say anything about that, and I'm a strict "mandatory" vegetarian. As a vegetarian I should be warned that some of my mandatory choices are potentially not fulfilled.

stephanegigandet commented 2 years ago

Hi @monsieurtanuki

  • Foreground: white
  • "Very good match 88/100": #009458
  • "Good match 70/100": #5da92d
  • "Poor match 48/100": #ce8c26
  • "Unknown match 45/100": #4f4f4f
  • "May not match 43/100": #e77128

The colors are a bit off, did you take them from the screenshot? I think GitHub or my browser changed them slightly.

I used the same colors as the A to E grades:

  • $grade-a-color: #219653; = Very good match
  • $grade-b-color: #60ac0e; = Good match
  • $grade-c-color: #c88f01; = Poor match
  • $grade-d-color: #e07312; = May not match
  • $grade-e-color: #eb5757; = Does not match --> Score is 0

Those are the colors from the Figma design, but it looks like the ones we have in Smoothie are older.

And a new color for "Unknown match": #888888

  • if needed, we can already code the new algorithm in smoothie and make it available with dev mode
  • I don't know how relevant it would be to code it in off-dart instead, or perhaps with a distinct version number (e.g. MatchedProductV2 and MatchedProductStatusV2)

It would be great to do it in openfoodfacts-dart eventually.

stephanegigandet commented 2 years ago

> @stephanegigandet I think there's a flaw in the algorithm: you loop on the product attributes, and then see if they match with the preferences. That should be the other way around, shouldn't it?

All products have all attributes computed (with the exception of the Eco-Score and forest footprint that are not available in all countries), so we can loop on all attributes and check their setting.

I think it's easier to do it like this because the API returns attributes in attribute groups.

For the Dart implementation, I think it's best to mimic the JavaScript implementation exactly; it will be much easier to keep them in sync. If one does things the other way around, it becomes much more difficult to compare the implementations.

> Like, some product contains meat but doesn't say anything about that, and I'm a strict "mandatory" vegetarian. As a vegetarian I should be warned that some of my mandatory choices are potentially not fulfilled.

Yes, definitely. That's what happens on the website. e.g. on https://world.openfoodfacts.org , if I set only one preference, "vegetarian" as "mandatory", and search for "sausages", I get this:

image

  • Very good match: we are sure it's vegetarian
  • May not match: it contains something that might not be vegetarian (in this case, it's "natural flavouring")
  • Unknown match: there are some ingredients we did not recognize, so we can't be sure it's vegetarian
  • Does not match: we detected a non-vegetarian ingredient

monsieurtanuki commented 2 years ago

Thank you @stephanegigandet for your comments.

> All products have all attributes computed (with the exception of the Eco-Score and forest footprint that are not available in all countries), so we can loop on all attributes and check their setting.
>
> I think it's easier to do it like this because the API returns attributes in attribute groups.

It's easier but not safe: you know "All products have all attributes computed" but I'm rather skeptical (not about the current state of the devs but about future bad luck events or additional optional attributes). If you don't mind I'll keep the logic coded in js but I'll add some checks in the end, which will probably focus only on "mandatory" preferences.
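
A minimal sketch of such an end-of-loop check, assuming a product exposes attribute groups as arrays of attributes with an `id` field. The function name `missingMandatoryAttributes` and the data shapes are hypothetical and not taken from product-search.js or off-dart.

```javascript
// Hypothetical post-check, not part of product-search.js.
// preferences: { attributeId: "not_important" | "important" | "very_important" | "mandatory" }
// attributeGroups: [{ attributes: [{ id, status, match }] }] (assumed shape)
function missingMandatoryAttributes(preferences, attributeGroups) {
  // Collect the ids of all attributes actually present on the product.
  const present = new Set();
  for (const group of attributeGroups) {
    for (const attribute of group.attributes || []) {
      present.add(attribute.id);
    }
  }
  // Keep the mandatory preferences that no product attribute covers.
  return Object.keys(preferences).filter(
    (id) => preferences[id] === "mandatory" && !present.has(id)
  );
}

const preferences = { vegetarian: "mandatory", organic: "important" };
const attributeGroups = [
  { attributes: [{ id: "organic", status: "known", match: 100 }] },
];
console.log(missingMandatoryAttributes(preferences, attributeGroups)); // ["vegetarian"]
```

A caller could treat a non-empty result like an unknown mandatory attribute, so that the user is still warned when the product says nothing about a mandatory preference.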

> For the Dart implementation, I think it's best to mimic the JavaScript implementation exactly; it will be much easier to keep them in sync. If one does things the other way around, it becomes much more difficult to compare the implementations.

Sure, my goal was not to code differently in dart, my goal was to warn about a possible flaw in the original js algorithm.

Working on it https://github.com/openfoodfacts/openfoodfacts-dart/issues/466

monsieurtanuki commented 2 years ago

@stephanegigandet Is https://github.com/openfoodfacts/openfoodfacts-server/blob/main/html/js/product-search.js the code that is actually running on the website?

I'm a bit puzzled because I don't see how you can get a non-zero score with only a mandatory attribute.

Correct me if (when) I'm wrong:

monsieurtanuki commented 2 years ago

@stephanegigandet Aha! I've just compared the "github" code with the current web code, and the only code difference is that on the website the preferences_factors is 2 (and not 0) for mandatory, which fixes the bug I mentioned in my previous post.

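From what I've just understood, let me explain the meaning of the values of preferences_factors regarding score computing:

  • not_important : 0 // we just ignore not important attributes
  • important: 1 // we give to important attributes the standard weight of 1
  • very_important, mandatory: 2 // we give to very important and mandatory attributes a double weight

Beyond the computation of the score, mandatory attributes are relevant only regarding "does not match" and "may not match" statuses - and that's their difference with "very important" attributes (as they have the same double weight for score computing):

  • if a mandatory attribute score is very bad, we flag the product as "does not match"
  • if a mandatory attribute score is just bad, we flag the product as "may not match"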

stephanegigandet commented 2 years ago

Hi @monsieurtanuki , you are right, I made changes yesterday following some feedback on Slack, as indeed it was incorrect to have 0 weight for the mandatory items.

> From what I've just understood, let me explain the meaning of the values of preferences_factors regarding score computing:

>   • not_important : 0 // we just ignore not important attributes
>   • important: 1 // we give to important attributes the standard weight of 1
>   • very_important, mandatory: 2 // we give to very important and mandatory attributes a double weight

> Beyond the computation of the score, mandatory attributes are relevant only regarding "does not match" and "may not match" statuses - and that's their difference with "very important" attributes (as they have the same double weight for score computing):

>   • if a mandatory attribute score is very bad, we flag the product as "does not match"
>   • if a mandatory attribute score is just bad, we flag the product as "may not match"

Yes, this is all correct. With the addition that if the setting is "does not match", we set the final score to 0.
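
To make the weighting concrete, here is a small sketch of the score computation with these factors. The helper name `weightedScore`, the input shapes, and the normalization by the sum of factors (to keep the score on a 0 to 100 scale) are assumptions made for this illustration; unknown attributes are left out for brevity.

```javascript
// Weights as described in this thread.
const PREFERENCE_FACTORS = {
  not_important: 0,
  important: 1,
  very_important: 2,
  mandatory: 2,
};

// Hypothetical helper. attributes: [{ id, match }] with match from 0 to 100;
// preferences: { attributeId: preference }.
function weightedScore(attributes, preferences, factors = PREFERENCE_FACTORS) {
  let score = 0;
  let sumOfFactors = 0;
  let doesNotMatch = false;
  for (const attribute of attributes) {
    const preference = preferences[attribute.id] || "not_important";
    const factor = factors[preference];
    score += attribute.match * factor;
    sumOfFactors += factor;
    // A mandatory attribute with a very bad score means "does not match".
    if (preference === "mandatory" && attribute.match <= 10) doesNotMatch = true;
  }
  // Assumption: the weighted sum is divided by the total weight.
  const normalized = sumOfFactors === 0 ? 0 : score / sumOfFactors;
  // A product that does not match gets a final score of 0.
  return doesNotMatch ? 0 : normalized;
}

const vegetarianOnly = [{ id: "vegetarian", match: 100 }];
const prefs = { vegetarian: "mandatory" };
console.log(weightedScore(vegetarianOnly, prefs)); // 100

// With the earlier factor of 0 for mandatory, the same product scores 0.
const oldFactors = { ...PREFERENCE_FACTORS, mandatory: 0 };
console.log(weightedScore(vegetarianOnly, prefs, oldFactors)); // 0
```

The second call reproduces the puzzle raised earlier in the thread: with a weight of 0, a single mandatory preference can never produce a non-zero score.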

monsieurtanuki commented 2 years ago

Thank you @stephanegigandet for your answer. I'm about to PR on off-dart.

Some additional potential bug: in your code, when the attribute is mandatory and the status is unknown, you set a variable to "status=unknown" but you don't reuse it later. For the record you do similar tests with "very bad score => does not match" and "bad score => may not match", and use the result later with product.attributes_for_status.

stephanegigandet commented 2 years ago

For reference, yesterday's change setting mandatory to 2 (not merged yet, but deployed on the website): https://github.com/openfoodfacts/openfoodfacts-server/pull/6797

stephanegigandet commented 2 years ago

> Some additional potential bug: in your code, when the attribute is mandatory and the status is unknown, you set a variable to "status=unknown" but you don't reuse it later. For the record you do similar tests with "very bad score => does not match" and "bad score => may not match", and use the result later with product.attributes_for_status.

Where is that exactly?

In this code?

`{

                var attribute_factor = preferences_factors[attribute_preference];
                sum_of_factors += attribute_factor;

                if (attribute.status === "unknown") {

                    sum_of_factors_for_unknown_attributes += attribute_factor;

                    // If the attribute is mandatory and the attribute status is unknown
                    // then mark the product status unknown

                    if (attribute_preference === "mandatory") {
                        match_status_for_attribute = "unknown_match";
                    }
                }
                else {

                    debug += attribute.id + " " + attribute_preference + " - match: " + attribute.match + "\n";

                    score += attribute.match * attribute_factor;

                    if (attribute_preference === "mandatory") {
                        if (attribute.match <= 10) {
                            // Mandatory attribute with a very bad score (e.g. contains an allergen) -> status: does not match
                            match_status_for_attribute = "does_not_match";
                        }
                        // Mandatory attribute with a bad score (e.g. may contain traces of an allergen) -> status: may not match
                        else if (attribute.match <= 50) {
                            match_status_for_attribute = "may_not_match";
                        }
                    }
                }

                if (!(match_status_for_attribute in product.attributes_for_status)) {
                    product.attributes_for_status[match_status_for_attribute] = [];
                }
                product.attributes_for_status[match_status_for_attribute].push(attribute);

                product.match_attributes[attribute_preference].push(attribute);
            }`

monsieurtanuki commented 2 years ago

@stephanegigandet Indeed:

// If the attribute is mandatory and the attribute status is unknown
// then mark the product status unknown
if (attribute_preference === "mandatory") {
  match_status_for_attribute = "unknown_match";
}

You don't reuse it later like you do there:

// If one of the attributes does not match, the product does not match
if ("does_not_match" in product.attributes_for_status) {
  // Set score to 0 for products that do not match
  score = "0";
  product.match_status = "does_not_match";
}
else if ("may_not_match" in product.attributes_for_status) {
  product.match_status = "may_not_match";
}

I expected something like that:

else if ("unknown_match" in product.attributes_for_status) {
  product.match_status = "unknown_match";
}

stephanegigandet commented 2 years ago

Thank you very much @monsieurtanuki , it is a bug indeed, I'm fixing it.
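
For illustration, the roll-up with the missing `unknown_match` branch added could look like the sketch below. The function name `statusFromMandatoryAttributes` is hypothetical, and this is not the actual patch applied to product-search.js.

```javascript
// Hypothetical roll-up: the product status is the most severe status found
// among the keys of attributes_for_status, or null if none of the three
// statuses below was recorded.
function statusFromMandatoryAttributes(attributesForStatus) {
  const severityOrder = ["does_not_match", "may_not_match", "unknown_match"];
  for (const status of severityOrder) {
    if (status in attributesForStatus) return status;
  }
  return null; // the caller falls back to the score-based status
}

console.log(statusFromMandatoryAttributes({ unknown_match: [{ id: "vegan" }] })); // unknown_match
console.log(
  statusFromMandatoryAttributes({
    unknown_match: [{ id: "vegan" }],
    may_not_match: [{ id: "gluten_free" }],
  })
); // may_not_match
```

Before the fix, the first call would have fallen through to the score-based status, because only the first two statuses were checked.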

monsieurtanuki commented 2 years ago

If someone feels like coding that tomorrow Saturday 21st, be my guest:

monsieurtanuki commented 2 years ago

Working on it today.