Open FoodCoach-App opened 2 years ago
That's a very interesting topic. The CSV upload is in fact the easier part: we have that in our platform for producers: https://world.pro.openfoodfacts.org There's a CSV template, and we support custom files too.
It brings a lot of questions though:
In addition, we could add the PLU numbers to the corresponding categories in the categories taxonomy.
I'll answer these questions based on how I'd like to use the data for my platform FoodCoach.
Further Recommendation a. I'd recommend that the "serving size" be "one" of the produce item (one apple, one onion, one melon), with the values per 100 grams filled out as normal. The product size would be the average mass of a single apple, onion or melon.
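As a rough illustration of the serving-size idea above: the value for one item is the per-100 g value scaled by the item's average mass. The function name and the numbers are made up for the example; they are not USDA or IFPS figures.

```python
def per_serving(value_per_100g: float, serving_mass_g: float) -> float:
    """Scale a per-100 g nutrient value to one serving (one produce item)."""
    return value_per_100g * serving_mass_g / 100.0


# Hypothetical numbers: 10 g of sugar per 100 g, one apple averaging 180 g.
sugar_per_apple = per_serving(10.0, 180.0)
print(sugar_per_apple)
```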
I would be happy to upload this data as described myself, and would only need a corporate account. I would like to ensure that what I have described is reasonable, consistent with the goals of OFF and that the data format makes sense.
On Thu, Nov 24, 2022, 11:02 AM Pierre Slamich @.***> wrote:
- Related, this old 2016 bug from me: #561 https://github.com/openfoodfacts/openfoodfacts-server/issues/561
@stephanegigandet maybe we should prefix that kind of code, and have an ID like plu-00000. By default we know digits are EAN. I think this calls for a small rework of the current code (in the code validation function).
I think importing this kind of database is really in tune with our project goal.
I would put "USDA database" as the brand
@stephanegigandet maybe we should prefix that kind of code, and have an ID like plu-00000.
we could do something similar for the products without barcodes, and prefix them with off-
one question is what to do with the USDA FDC entries: do we import them separately? same for ciqual?
so we could have one "PLU" orange, one "USDA FDC" orange, one "CIQUAL" orange.
another option is to have only one PLU orange, and then complete its nutrient data from either USDA FDC or CIQUAL.
so we could have one "PLU" orange, one "USDA FDC" orange, one "CIQUAL" orange.
Personally I prefer this approach. A data source is like a brand / producer.
Otherwise it might be quite funky to run updates.
Then if we want, in the future, to have a special "Orange" that mixes them up, we could have it also independently.
I don't understand the desire to separate the nutrition information by data source.
If a user searches for a specific type of apple (4015 for example), they don't care if the information is from USDA or CIQUAL.
The USDA and CIQUAL data should be nearly identical anyway.
We can list the source, but when I search for 4015, I'd expect it to return information just as it does today.
https://world.openfoodfacts.org/product/4015/red-delicious-apple
(I didn't put this item in the database and was surprised that all PLU codes weren't already included. )
PLU codes only have 4 or 5 digits, so there won't be any confusion with the longer EAN barcodes, and there is unlikely to be a need for a PLU prefix.
A closer reading of some of the prior comments leads me to believe there is some confusion regarding the PLU codes.
A PLU code is for a specific name and type of commodity. A "Granny Smith" apple is a different PLU code from a "Red Delicious" apple, which is different from a "Fuji" apple. Hope this helps.
@FoodCoach-App, the separation I propose is more from a maintenance point of view, but also because sources may diverge. Sausage may not mean the same thing everywhere. I see that the international meaning of food is not so uniform, which may lead to surprises. Personally I think that making separate items is cleaner and easier because we can take the exact name from each classification. That said, we can have a special label, like "reference food", so that in a category you can quickly find all "reference foods" (e.g. for the "red apples" category).
Also, the plu prefix is more about thinking long term, when we may want to mix different types of codes and the risk of clashes becomes higher. For example, CIQUAL codes are also 5 digits and may clash with PLU codes.
@alexgarel Can a consumer in France or the EU purchase an item and the item states that it is CIQUAL code "12345"? The CIQUAL codes seem to be the database of nutrition information that is made easy to lookup with numeric identifiers.
Edit: Posted too soon.
The PLU code is readily visible to consumers purchasing produce in the US and, if a bi-layer barcode is present on the PLU sticker, it is embedded in it. The goal of this request is to link the information that the consumer sees to nutrition information, from any source.
The CIQUAL database looks like it is easy to search. Unless consumers see the CIQUAL number on the packaging, why would we need to duplicate it in OFF?
Specifically looking to upload all PLU codes from here (https://www.ifpsglobal.com/PLU-Codes) with the free to use USDA nutritional information from here (https://fdc.nal.usda.gov/fdc-app.html#/food-details/2346398/nutrients).
What's the license of the data? Can this "free to use" data be combined with DbCL content?
The USDA nutritional database is free to use. The IFPS data is user-submitted and the policy is here: https://www.ifpsglobal.com/Terms I'm not a lawyer, but I see no prohibition against using the data.
All of the data is easily downloadable as a .csv here: https://www.ifpsglobal.com/PLU-Codes/PLU-codes-Search
I have reached out to IFPS and asked this specific question.
@FoodCoach-App I was not aware that the PLU was written on products in the USA !
@alexgarel not a problem. This was my first introduction to CIQUAL data as well. Does this change your suggestion(s) regarding how to store nutritional data for PLU codes? A PLU is almost directly analogous to UPC (at least in the USA).
If it would help, here is the current status (incomplete) of the proposed data-base to add to OFF.
In the spreadsheet, columns A-E are from IFPS and F-M are from the USDA but could just as easily be from CIQUAL. The nutrition data is all listed as "g/100g" to make it easy to find the total nutritional value based on the weight/mass of the produce. I have not differentiated nutritional information between the varieties (Fuji, Granny Smith, Red Delicious) for each commodity (apple); for now, all "apples" have the same nutritional information.
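To make the spreadsheet layout concrete, here is a small sketch of attaching commodity-level nutrition to every PLU row of that commodity. The column names (`plu`, `commodity`, `variety`, `carbohydrates_100g`, `proteins_100g`) and the nutrient values are placeholders I made up; the real headers and USDA figures will differ.

```python
import csv
import io

# Hypothetical column names; the real spreadsheet headers may differ.
ifps_csv = """plu,commodity,variety
4015,Apples,Red Delicious
4173,Apples,Royal Gala
3024,Pears,Rocha
"""

# Placeholder per-100 g values keyed by commodity (NOT real USDA figures).
nutrition_per_100g = {
    "Apples": {"carbohydrates_100g": 1.0, "proteins_100g": 2.0},
    "Pears": {"carbohydrates_100g": 3.0, "proteins_100g": 4.0},
}


def merge_rows(ifps_text, nutrition):
    """Attach commodity-level nutrition to every PLU row of that commodity."""
    merged = []
    for row in csv.DictReader(io.StringIO(ifps_text)):
        values = nutrition.get(row["commodity"], {})
        merged.append({**row, **values})
    return merged


rows = merge_rows(ifps_csv, nutrition_per_100g)
for row in rows:
    print(row)
```

Because the lookup is by commodity, both apple varieties end up with identical nutrient columns, which matches the current state of the spreadsheet.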
Does this change your suggestion(s) regarding how to store nutritional data for PLU codes?
Not really, I'm still more in favor of a separation by source, and on a prefix by type of code. But we can consider making an exception for PLU on the prefix.
We have permission from IFPS to use the data.
My request: What are the restrictions for reuse of the IFPS database? I'd like to marry the PLU's on your IFPS database with the USDA nutrition information and upload to OpenFoodFacts.com. Would this be permitted under your terms of use?
The Response: Hi Victor, It is fine to use the PLU codes as you mention in your email. If you could reference the IFPS Global website (https://www.ifpsglobal.com/PLU-Codes/PLU-codes-Search) for more information, that would be great. Best regards, Wendy
@alexgarel https://en.wikipedia.org/wiki/Price_look-up_code These codes also seem to be used outside the USA, at least in France: https://fr.wikipedia.org/wiki/Code_price_look-up
I'm also in favor of using prefixes.
Thanks @CharlesNepote!
I did a quick check; the examples on the wiki page match the PLU database that I have.
PLU 3024 = Poire Rocha (En: Rocha Pears)
PLU 4173 = Petite pomme royal gala (En: Royal Gala Apples: Small)
PLU 4174 = Grosse pomme royal gala (En: Royal Gala Apples: Large)
PLU 4664 = Tomate rouge (En: Red Tomatoes on vine)
(edited to remove erroneous links to #4664 Slack conversations)
Question regarding the use of a prefix. How would a new user know to use a prefix, or what prefix to use? The PLU number is similar to a barcode and is often the only information on the produce. I'd imagine that they would look it up just like I did, by searching for "3024" or "4015" as if it were the UPC/EAN number.
The issue as I understand it is that the PLU and CIQUAL numbers overlap. If the CIQUAL numbers don't appear on produce, why would a user search for the CIQUAL number on the OFF database? If they already had the CIQUAL number, wouldn't they search on the CIQUAL database?
Put another way, what is the use-case for someone searching for nutrition information from CIQUAL numbers on the OFF database?
Or am I missing the pre-fix concern entirely?
It's not specific to PLU and CIQUAL, small numbers are likely to conflict with something now or in the future. There's no cost in adding a prefix, and we can make the search work so that it returns a PLU 4664 when someone types in 4664. The most common use case is probably going to be to search fruits and vegetables by name, not by PLU, so what matters the most is that we do load the PLU items as products. Same for ciqual that has more than fruits and vegetables: fish, meat etc.
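A minimal sketch of the search behaviour described above, assuming an in-memory dictionary in place of the real OFF index. `PRODUCTS`, `FALLBACK_PREFIXES` and `find_product` are hypothetical names for illustration, not existing OFF server code.

```python
# Toy in-memory index standing in for the product database (hypothetical data).
PRODUCTS = {
    "plu-4664": "Red Tomatoes on vine",
    "1234567890123": "Placeholder packaged product",
}

# Prefixes to try, in order, when a bare number has no exact match (assumed list).
FALLBACK_PREFIXES = ("plu-",)


def find_product(query: str):
    """Return (code, name) for an exact match, else try prefixed variants."""
    code = query.strip().lower()
    if code in PRODUCTS:
        return code, PRODUCTS[code]
    if code.isdigit():
        for prefix in FALLBACK_PREFIXES:
            candidate = prefix + code
            if candidate in PRODUCTS:
                return candidate, PRODUCTS[candidate]
    return None


print(find_product("4664"))
print(find_product("plu-4664"))
print(find_product("9999"))
```

Trying the exact code first means a real barcode always wins over a prefixed generic entry, and adding another source later would only mean extending the prefix tuple.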
Got it. Thank you. On my end, I can make the FoodCoach API search the OFF database for 4053 (from the picture below).
How do we proceed?
Second Question, Still Related to PLU's: How does the OFF app handle reading GS1 Databar Stacked Omnidirectional barcode symbol stickers like this? The 4 digit PLU is embedded in the 13 digit GTIN number string. I'd recommend cropping off the extraneous digits and simply returning the data for the 4 or 5 digit PLU code (the numbers in light blue on the second image).
Second Question, Still Related to PLU's: How does the OFF app handle reading GS1 Databar Stacked Omnidirectional barcode symbol stickers like this?
Well, is there a way to know that a GTIN-13 number follows that scheme? I think we should keep the original GTIN, but if we know that there is a PLU inside, we could link to it and/or use PLU data to complement it.
Yes, GTIN13 numbers follow that scheme.
GTIN12 numbers (used in the US) use this scheme.
https://www.gs1us.org/documents?Command=Core_Download&EntryId=554
The full GTIN contains the grower/manufacturer, which is not relevant for nutritional information.
What bar-code reading software does OFF use for stacked, GS-1 barcodes?
I'd like to move forward with this. Here is what I think we've agreed to thus far.
Did I miss anything?
If we are in agreement, how do I register as a corporation to use the bulk upload feature?
@stephanegigandet I propose to add a USDA-PLU org and add @FoodCoach-App as an org admin, is that ok ? Or maybe @FoodCoach-App you want to make a specific account for this ?
Well the issue is that we don't support codes like plu-4325 yet. We could prepare the file, test it on the producers platform etc. already though.
I'll continue to prepare the file, consistent with what I think we've agreed on and closely matching the .csv file format I've uploaded to github already in this thread.
What is required to add the support of codes like "plu-4325" and who is able and interested in doing that work?
@FoodCoach-App, I have opened https://github.com/openfoodfacts/openfoodfacts-server/issues/7806
Thanks @alexgarel.
I'll continue populating the PLU-to-nutrition database. Anything else?
One thing to add to the list of actions on #7806 is allowing a user to search for 4015 and have it return "plu-4015". Otherwise users not using apps wouldn't know how to find the nutrition data.
Happy Holidays everyone!
After the holidays, I'm thinking there are two ways that a user might want to find PLU information.
For 1, there would be no change. OFF would simply insert a "plu-" prefix and it would be the same.
For 2, OFF would need to find the 10-13th digits in the string, pull them out of the string and then insert the "plu-" prefix.
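The two cases above could be sketched roughly as follows. It assumes, as described in this thread, that the scanned string has 13 digits and that the PLU sits in digits 10 to 13; I have not checked that layout against the GS1 specification. `to_plu_code` is a hypothetical helper name.

```python
def to_plu_code(raw: str) -> str:
    """Turn a typed PLU or a scanned 13-digit string into a "plu-" code."""
    digits = raw.strip()
    if not digits.isdigit():
        raise ValueError(f"not a numeric code: {raw!r}")
    if len(digits) in (4, 5):
        # Case 1: the user typed the PLU printed on the sticker.
        return "plu-" + digits
    if len(digits) == 13:
        # Case 2: full scanned string; digits 10-13 (1-based) are assumed
        # to hold the PLU, per the description in this thread.
        return "plu-" + digits[9:13]
    raise ValueError(f"unexpected code length: {len(digits)}")


print(to_plu_code("4015"))
print(to_plu_code("0000000004053"))
```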
@FoodCoach-App this would be kind of a fall-back mechanism ?
Long term, the users should be able to scan the bi-level GS1 barcode just like any regular barcode.
The bi-level GS1 barcode scanners required to read PLU stickers are difficult to find and are not in most barcode scanning apps.
I would imagine that the "best possible answer" would be that users would scan the bi-level GS1 barcode and OFF would return the nutrition information based on the 10-13th digits.
Here is an example of the PLU barcode. I've had a hard time finding barcode readers that can read it. Aspose.BarCode can, but I haven't been able to incorporate it into FoodCoach.
What barcode reader is in the OFF apps? Can it read this?
I'll download the app and give it a shot. The food won't be in the database, so I'm not sure yet what error code I'm going to get.
I've downloaded the app. The app does not find/scan/register the bi-level GS-1 barcodes as shown above. Several others do. The best case scenario would be to allow users to scan the bi-level barcode just as they do with all other barcodes. The first series of digits are the country of origin and the grower; neither needs to be stored in OFF, as the nutritional content will be the same.
That being said, I do like the app. It's very well put together, flows well and is intuitive.
@FoodCoach-App I think you should open an issue on smooth-app repository for bi-level GS-1 barcode scanning. In this repository we deal with server side aspects.
Will do. I'll use the same language and image as above.
Sort of hairy topic. We have 'foods' and products. In the USDA database, a 'survey' food would be something like 'apple, raw' and the product would be that food from a certain grower in a certain country. Some products don't have a 'food' basis but are composed of ingredients. The nutrition of all the products for a certain food could be shared.
My two cents: either prefix everything or key the food database on code and codetype.
@chriswhiteoco I'm not sure if I understand your question. This request would be to add nutritional data for different types of foods / commodities as defined by PLU codes (the little sticker on fruits and vegetables). While the specific grower and the country code is included on the full PLU bi-level barcode, the only relevant data for nutrition content is the 4 digit PLU code.
The general agreement here was to add a "plu" prefix to the PLU codes. Long term, users should be able to use a bi-level barcode scanner to read the full 13 digit code, and then the OFF app or others can omit the first 9 digits, add the PLU designation and return the PLU nutrition information.
@FoodCoach-App I was thinking that because the code field is the key to the food table that it is important to avoid identifier collisions. Prefixing all codes with the code type would help with this. EAN-1234567890123, PLU-4053, etc.
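To illustrate the collision concern, here is a toy table keyed on code type plus code, so that the same digits from two sources stay distinct. `FoodKey`, the type labels and the sample entries are assumptions for the example, not the OFF data model.

```python
from typing import NamedTuple


class FoodKey(NamedTuple):
    code_type: str  # e.g. "plu", "ciqual", "ean" (labels are assumptions)
    code: str


def prefixed(key: FoodKey) -> str:
    """Flatten a composite key into a single prefixed identifier."""
    return f"{key.code_type}-{key.code}"


# The same 5 digits under two code types do not overwrite each other.
foods = {}
foods[FoodKey("plu", "12345")] = "hypothetical PLU entry"
foods[FoodKey("ciqual", "12345")] = "hypothetical CIQUAL entry"

print(len(foods))
print(prefixed(FoodKey("plu", "4053")))
```

Either form works: the composite key keeps the type in its own column, while the prefixed string keeps a single `code` field as the key.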
Also, in lexicon applications, there is a root 'concept' then many specializations. For example, 'whole milk' would be a root concept for a food with an OFF identifier (or some other identifier like the PLU code) then there would be lots of 'product' entries that have 'whole milk' as their parent concept. This is what I meant when I was talking about how the USDA data has foods and products.
Many 'products' would also be root concepts as they are a unique packaging of ingredients.
Of course, the OFF data doesn't currently support this concept. It is pretty much a 'products' database that focuses on foods.
@chriswhiteoco Thanks, I understand now.
I understand that OFF doesn't currently support this concept, but I think it aligns well with the goals of OFF, to easily determine nutrition information by scanning a barcode and collecting nutrition information.
In my embodiment/usage of OFF, users are upset/confused why they can't scan the bi-level barcode on fresh produce to "get credit" for healthy choices.
At this point, it's looking like I'm going to need to store the PLU-to-nutrition information outside of OFF.
@FoodCoach-App yes, supporting PLU aligns with the goals of OFF, and we are willing to move toward supporting prefixes so that USDA data can be imported as you mentioned. But at the moment we lack developer time for this, sadly.
This was discussed in the OFF Days event today. Apologies for duplicating what has already been said, but the agreed action plan was as follows:
@stephanegigandet , @odtvince , @daims971
I love these updates!
On Sun, Oct 22, 2023 at 9:58 AM john-gom @.***> wrote:
This was discussed in the OFF Days event today. Apologies for duplicating what has already been said, but the agreed action plan was as follows:
- Add PLU codes and other attributes to the categories taxonomy, adding missing items where necessary
- Potentially do the same with CIQUAL codes (fill in the gaps) and maybe other data sources too
- Introduce a "generic product" attribute on the product data model and have historic search APIs default to not return these (as the code will not be a real barcode)
- Create 2 products for each PLU code, one for regular and one for organic (which has the "9" prefix). Use a "plu-" prefix for these
- Also do the same for CIQUAL and other sources
- The PLU CSV links to images which we could maybe use, or we could use generative AI to produce these
- Enhance the mobile app to recognise PLU barcodes and search with them using the defined prefix
- Potentially enhance text search ranking so that generic products appear first
@stephanegigandet, @odtvince, @Daims971
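The "2 products for each PLU code" item in the plan could look roughly like this. The field names (`code`, `product_name`, `generic_product`, `labels`) are placeholders I chose for the sketch, since the "generic product" attribute does not exist in the data model yet.

```python
def generic_products(plu: str, name: str):
    """Build a regular and an organic generic-product record for one PLU."""
    if not (plu.isdigit() and len(plu) == 4):
        raise ValueError("expected a 4-digit conventional PLU")
    return [
        {"code": f"plu-{plu}", "product_name": name, "generic_product": True},
        {
            # Organic variant: the conventional PLU with a leading "9".
            "code": f"plu-9{plu}",
            "product_name": f"Organic {name}",
            "generic_product": True,
            "labels": "organic",
        },
    ]


for product in generic_products("4015", "Red Delicious Apple"):
    print(product["code"], product["product_name"])
```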
Interesting link: https://github.com/topics/food-classification
Description
Bulk CSV file "add" to database. Specifically looking to upload all PLU codes from here (https://www.ifpsglobal.com/PLU-Codes) with the free to use USDA nutritional information from here (https://fdc.nal.usda.gov/fdc-app.html#/food-details/2346398/nutrients). There are 1423 unique items on this list; I'd hate to have to do this manually.
Acceptance criteria
API or website based CSV reader that would take the CSV file and:
What would a demo look like
Notes
Tasks