Open teolemon opened 3 years ago
@teolemon I'm interested!
👋 @monsieurtanuki The full minimized DB is located at https://world.openfoodfacts.org/data/offline/en.openfoodfacts.org.products.small.csv.zip The minimal implementation we've done so far on the classic iPhone app is to unzip it, load it into the database, and show the values stored, and update them as we receive a value from the live API. We re-download the file monthly and update the local db.
And that's it. Smoothie is a little more complex with customized rankings and all, but I think we should start with something simple we can iterate on.
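The unzip / load / refresh cycle described above could be sketched roughly like this (a minimal sketch, assuming a local SQLite cache; the table and column handling here is hypothetical, only the tab-separated `code` / `product_name` columns of the dump are assumed):

```python
import csv
import io
import sqlite3
import zipfile


def load_offline_db(zip_bytes: bytes, conn: sqlite3.Connection) -> int:
    """Unzip the monthly CSV dump and (re)load it into a local table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products"
        " (code TEXT PRIMARY KEY, product_name TEXT)"
    )
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        name = zf.namelist()[0]
        with zf.open(name) as f:
            reader = csv.DictReader(
                io.TextIOWrapper(f, encoding="utf-8"), delimiter="\t"
            )
            rows = [(r["code"], r.get("product_name", "")) for r in reader]
    conn.executemany("INSERT OR REPLACE INTO products VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def update_from_live_api(conn: sqlite3.Connection, code: str, name: str) -> None:
    """Overwrite the cached row whenever the live API returns a fresher value."""
    conn.execute("INSERT OR REPLACE INTO products VALUES (?, ?)", (code, name))
    conn.commit()
```

The monthly re-download then just calls `load_offline_db` again; `INSERT OR REPLACE` keeps the refresh idempotent.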
If you'd like to have an idea of additional complexities and sophistications, there's a (somewhat confusing) Google Doc: https://docs.google.com/document/d/1URdZL2pxIP-lCkM9ZzUVMnoSr2o-d3741TyWimzrJI8/edit#
There's also the Swift code for the iOS implem if you'd like to have a look: https://github.com/openfoodfacts/openfoodfacts-ios/commit/20723dab571fd0c155197093190d293886f876bd
Also, wanna join the channel on Slack? https://slack.openfoodfacts.org
(for the moment I have no access to the google docs file)
The csv file is about 80 MB, and has 1,555,491 lines plus a header.
Some stats, with a Unix command like `cat en.openfoodfacts.org.products.small.csv | awk -F'\t' '{print $7}' | sort | uniq -c`:
- `ecoscore_grade` is never populated
- `nutrition_grade_fr` is never populated
- `nova_group` takes the values 1.0, 2.0, 3.0 and 4.0
That means that there are only 577,722 lines with an attached value.
In addition to that, I saw tons of products in non-latin alphabets.
For the moment I don't see the use case...
aha… ecoscore: normal: not live yet. nutrition_grade_fr >> that's a bug, congratulations for finding it 🎉 :-)
Non latin alphabets: that's because that's the whole db for all countries
I've made a fix server side @monsieurtanuki
> Non latin alphabets: that's because that's the whole db for all countries
That's my point about not understanding the use-case: what's the point of pre-downloading the data of all countries? (disclaimer: I have a smartphone with limited memory and disk space)
What I had in mind is: I'm in a supermarket and I want to check the different scores of - say - breakfast cereals, but I have a bad internet connection. Why would I think about downloading in advance data about foods sold in Russia or Japan? There's no "just in case" argumentation. And my personal eco behavior does not find it relevant either. Beyond the fact that most of the time (as mentioned in an early comment) there's no actual data extracted.
What may make sense is to select food categories and countries, and to focus on that. Maybe automatically: I scan a Kellogg's in France, therefore there's a good chance I'm interested both in breakfast cereals and products sold in France, and that's the range of foods we could cache and refresh periodically.
It was not a just in case choice, but a speed one:
I've shared the design document, where there's a discussion on how to implement properly all of that.
@monsieurtanuki @stephanegigandet says we should start using this country-specific route as you suggested: https://fr.openfoodfacts.org/api/v2/search?fields=code,product_name&page_size=1000 And he will ensure the transition to 10K products will be transparent, once there's something working
@teolemon As I don't understand the purpose I think you should find someone else for this issue (e.g. someone who understands it) - that's why I removed my assignment to this issue about a year ago. But I can still answer questions, write comments and make suggestions (for instance about alternate solutions).
Speed up, or in the worst case enable, scanning in supermarkets. @jasmeet0817 told me that he started using the app in real life after the update for panel expansion. The highlights of his experience were the phone overheating after scanning for a while, and difficulty scanning due to network issues.
About https://fr.openfoodfacts.org/api/v2/search?fields=code,product_name&page_size=1000:
- 1000 products (page size is 1000, page number is 1)
- the file size is 67850 bytes
- it's basically a list of {barcode, name}, e.g.
{"code":"5010477348357","product_name":"Country Crisp 4 noix"}
- in total there are 871104 products
- the total file / download/ database size would then be around 60Mb
- we need to be very lucky when downloading the data page by page given the download size and the number of iterations - probable side-effects for products once in page X+1 and later in page X
- we would probably need to use SQFlite again, because even in "lazy" mode `hive` takes time at init in proportion to the number of records, and would (to be double-checked) pre-load at least the keys (with EAN13 that means around 11 MB)

I still don't understand the use-case: downloading 60 MB of world data for a limited added value (barcode => name). And we would need to refresh it altogether time after time.
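The ~60 MB figure above comes from extrapolating the sample page; the back-of-the-envelope arithmetic can be written out as:

```python
def estimate_total_download(bytes_per_page: int,
                            products_per_page: int,
                            total_products: int) -> float:
    """Extrapolate the full download size (in MB, 1 MB = 10^6 bytes)
    from one sample page."""
    pages = total_products / products_per_page
    return pages * bytes_per_page / 1_000_000


# Figures quoted above: 67,850 bytes for 1,000 products, 871,104 products total.
size_mb = estimate_total_download(67_850, 1_000, 871_104)  # ~59 MB
```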
@teolemon what's the end goal? To pre-download just (barcodes => name) or the whole product. If it's just names, I agree with @monsieurtanuki. If it's the whole product it's going to take a lot of memory, and the database is always increasing. We would also need a syncing mechanism.
Yes I did have issues where I scanned products and then I was just waiting for it to load, but my phone was also very heated up so I'm not really sure if the root cause was the heating up or network issues. Even if it was network issues, I would try to do a thorough analysis before working on this.
Since this is a non-trivial task, I would first validate that we really need this: I would add some metrics on how often product fetch calls time out, and if that rate is higher than a permissible threshold I would go for this feature. If needed, I would also suggest compressing the response payload as much as possible.
@teolemon There are separate problems here:
cf. https://world.openfoodfacts.org/api/v0/product/093270067481501.json and its 19558 bytes ("all" fields here)
> Yes I did have issues where I scanned products and then I was just waiting for it to load
@teolemon @jasmeet0817 For the low connectivity use-case, I suggest that we add in dev mode a switch between the current set of extracted fields, and a minimum set of fields. And then we send @jasmeet0817 go shopping :) (bad luck, it's crowded on Saturdays) What do you think of that, at least for test purposes? Faster scan, faster carousel. And when we go to the product page, we download all the fields (to be coded in a second step, if relevant).
Anyway, it's a bit paradoxical to use scores for fast assessment of products ("it's A, it's D") and flood users with tons of the detailed data from which the scores were computed. In a "more about it..." button, fair enough. But when you're in a busy supermarket with low (or expensive) connectivity, the faster the better.
And more or less that's similar to the OP: a downgraded mode, and a full mode. The difference is that "my" downgraded mode does not handle offline queries and doesn't imply pre-loading tons of data. I can work on that downgraded mode (= limited fields) / full mode.
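The dev-mode switch could boil down to two field lists feeding the same search URL (a sketch: the minimal list below is an assumption for illustration, not the app's actual list):

```python
# Hypothetical minimal field set for fast scanning; the full set is the long
# list of extracted fields quoted elsewhere in this thread.
BASE = "https://fr.openfoodfacts.org/api/v2/search"

MINIMAL_FIELDS = [
    "code", "product_name", "brands",
    "nutrition_grade_fr", "ecoscore_grade", "image_front_small_url",
]


def search_url(fields, page_size=1000, base=BASE):
    """Build the v2 search URL for a given field set."""
    return f"{base}?page_size={page_size}&fields={','.join(fields)}"


url = search_url(MINIMAL_FIELDS)
```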
Following is the current list of product fields we extract:
PS: my understanding of offline scanning hasn't changed, as described in the following video :) https://www.youtube.com/watch?v=oyll1XxKh-M
@teolemon The list of fields I provided in the previous comment was the current one - I guess gluten is already there, probably in `ATTRIBUTE_GROUPS`.
I suggest that you define a list of product fields that you think we definitely need: with that we can estimate the volume of the full offline database for France.
The query for the 1000 first products with our current list of fields is: https://fr.openfoodfacts.org/api/v2/search?page_size=1000&fields=code,product_name,brands,nutrition_grade_fr,image_small_url,image_front_small_url,image_front_url,image_ingredients_url,image_nutrition_url,image_packaging_url,selected_images,quantity,serving_size,product_quantity,nutriments,additives_tags,nutrient_levels,nutriment_energy_unit,ingredients_analysis_tags,labels,labels_tags_,environment_impact_level_tags,categories_tags_,lang,attribute_groups,states_tags,ecoscore_grade,ecoscore_score,ecoscore_data
The resulting size is 16,840,056 bytes, for 1000 products. For 871104 products it means 14.5 Gb, with all the current fields.
Probably this is needed to display only the summary card?
We could remove images altogether. What would be the cost of having 1K to 10K of those? https://images.openfoodfacts.org/images/products/20301415/front_fr.37.100.jpg
Edit: 2.7 KB × (1K to 10K) is roughly up to 30 MB (possibly times 3 to 4 if we want to show which images are available)
There are some fields that we possibly don't need anymore thanks to knowledge panels and attributes
That means 7Gb for the 900K records.
Now that you mention the images, let me point that image data (png, jpg) are not even included, just the urls.
yup, I know, we could save space by removing image urls, or conversely we could decide to let the user even get images. But probably not for 900K records.
If we really want offline storage, then the best would be to store the bare minimum data for only the top X popular products in a country.
> If we really want offline storage, then the best would be to store the bare minimum data for only the top X popular products in a country.
@jasmeet0817 @teolemon Looks like a very good idea! We could even download everything for those top X popular products.
Additional data following...
Without image_front_small_url: 8,442,485 bytes per 1K products - we don't win anything https://fr.openfoodfacts.org/api/v2/search?page_size=1000&fields=code,product_name,brands,nutrition_grade_fr,quantity,lang,attribute_groups,ecoscore_grade
Without attribute_groups: 281,493 bytes per 1K products - we do win a lot: 30 times smaller! https://fr.openfoodfacts.org/api/v2/search?page_size=1000&fields=code,product_name,brands,nutrition_grade_fr,quantity,lang,ecoscore_grade,image_front_small_url
The thing is that `attribute_groups` are too fat and redundant.
For instance, the first attribute of the first product:
```jsonc
{
  "description": "", // we don't need
  "icon_url": "https://static.openfoodfacts.org/images/attributes/nutriscore-a.svg", // we can use a reference
  "name": "Nutri-Score", // we can use a reference
  "title": "Nutri-Score A", // we can use a reference
  "grade": "a", // mandatory
  "id": "nutriscore", // mandatory
  "match": 100, // mandatory
  "status": "known", // we can assume that the status is known if there's a match > 0
  "description_short": "Très bonne qualité nutritionnelle" // we probably can use a reference in most cases
}
```
I tried to "simplify" the attributes of the first product, and I compressed from 8635 to 2308 bytes (modulo the \n)
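A sketch of that simplification: keep only the fields annotated "mandatory" in the example above, on the assumption that the rest can be rebuilt from reference tables keyed on `(id, grade)`. The `simplify_attribute` helper is hypothetical.

```python
# Fields marked "mandatory" in the annotated attribute above; everything else
# (icon_url, name, title, status, descriptions) is assumed derivable from
# reference data keyed on (id, grade).
MANDATORY = ("id", "grade", "match")


def simplify_attribute(attribute: dict) -> dict:
    """Strip an attribute down to its non-derivable fields."""
    return {key: attribute[key] for key in MANDATORY if key in attribute}


full = {
    "description": "",
    "icon_url": "https://static.openfoodfacts.org/images/attributes/nutriscore-a.svg",
    "name": "Nutri-Score",
    "title": "Nutri-Score A",
    "grade": "a",
    "id": "nutriscore",
    "match": 100,
    "status": "known",
    "description_short": "Très bonne qualité nutritionnelle",
}
mini = simplify_attribute(full)
```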
I've just simplified again the attributes, this time "à la SQL", and it looks less poetic but much more compact (should take half the space - for attributes):
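The SQL-style version could look like one compact row per (product, attribute) pair. A sketch assuming SQLite; the table and column names are hypothetical (`match` is spelled `match_score` since MATCH is an SQL keyword):

```python
import sqlite3

# Hypothetical compact layout: one row per (product, attribute), with the
# display strings reconstructed from reference data keyed on (attribute_id, grade).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product_attribute (
        barcode      TEXT NOT NULL,
        attribute_id TEXT NOT NULL,
        grade        TEXT,
        match_score  INTEGER,
        PRIMARY KEY (barcode, attribute_id)
    )
    """
)
conn.execute(
    "INSERT INTO product_attribute VALUES (?, ?, ?, ?)",
    ("5010477348357", "nutriscore", "a", 100),
)
row = conn.execute(
    "SELECT grade, match_score FROM product_attribute"
    " WHERE barcode = ? AND attribute_id = ?",
    ("5010477348357", "nutriscore"),
).fetchone()
```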
As I said before, you don't change a SQL database structure like you do with JSON files, therefore we should be very careful with what we really want:
- `where` clauses: obviously on the barcode, but on the name? on the categories? on some attributes?

@teolemon @g123k continuing from #5392
I would compute the number of products (1M for France?) multiplied by the size of each product (in a mini version: barcode, name, main image), and then I'll realize that the server doesn't accept more than 10 queries a minute.
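Those two constraints (page size and rate limit) give the order of magnitude directly. A sketch, using the 1M-product and 10-queries-per-minute figures from the comment above:

```python
def full_sync_minutes(total_products: int,
                      page_size: int,
                      max_queries_per_minute: int) -> float:
    """How long a full page-by-page download takes under the server rate limit."""
    pages = -(-total_products // page_size)  # ceiling division
    return pages / max_queries_per_minute


# 1M products, 1,000 per page, 10 queries/minute: 1,000 pages, 100 minutes
minutes = full_sync_minutes(1_000_000, 1_000, 10)
```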
That would also mean a specific MiniProduct table.
Actually I don't know the use-case, or more precisely: how minified should the product version be?
I think we should drop images altogether, in favor of less size or more info (e.g. image status, attributes…). Core use cases are:
We should steer away from search queries, and generate a one-size-fits-all mini dump for each country with product_name, nutriscore, ecoscore, nova_group, and possibly: attributes, states. We could slice the mini-dump based on user prefs (and remove some attributes)
> scanning in a supermarket with no network and getting the score
That could even be the subtitle of a new app! I used an offline map app years ago and the first step was to select which countries to download.
> does the product exist, and potentially do we have photos for it
In both cases, if we want to put that inside Smoothie that should be in distinct pages, at least in a first step:
@teolemon you can emulate the no-network scan session after downloading a significant set of products (cf. dev mode / offline) and switching to flight mode. Immediate results. Then you can think about which data a user would really need in this or that use case.