Open teolemon opened 1 year ago
@raphael0202 @stephanegigandet one reuser just told us that he successfully fed OCR output into ChatGPT and got a nutrition JSON back. My guess is that he provided a category to help the model. I ran a quick and dirty test without a category, and it seems to work well. https://images.openfoodfacts.org/images/products/433/725/629/5260/2.json https://images.openfoodfacts.org/images/products/433/725/629/5260/2.jpg
Can you parse the nutrition facts from this string into a JSON ? Durchschnittliche\nNährwerte\n%-RM*\npro 100g\n2086 kJ\n499 kcal\n26,0g\npro 100g\nEnergie\n25%\n37%\nFett\ndavon gesättigte\nFettsäuren\nKohlenhydrate\ndavon Zucker\nEiweiß\nSalz\n14,0g\n58,0g\n32,0g\n6,2g\n0,71g\n70%\n22%\n36%\n12%\n12%\n*RM = Referenzmenge für einen durchschnitt-\nlichen Erwachsenen (8400 kJ/2000kcal).\n
Based on the information given, it seems that you want to convert the provided German nutrition facts into a JSON object. I will assume that the values next to "pro 100g" and "%-RM*" respectively are the actual values per 100g and the percentages of the recommended daily intake.
Here is how you can represent it:
{
  "Nährwerte": {
    "Energie": {
      "per_100g": {
        "kJ": 2086,
        "kcal": 499
      },
      "daily_percentage": 25
    },
    "Fett": {
      "per_100g": "26.0g",
      "daily_percentage": 37
    },
    "gesättigte Fettsäuren": {
      "per_100g": "14.0g",
      "daily_percentage": 70
    },
    "Kohlenhydrate": {
      "per_100g": "58.0g",
      "daily_percentage": 22
    },
    "Zucker": {
      "per_100g": "32.0g",
      "daily_percentage": 36
    },
    "Eiweiß": {
      "per_100g": "6.2g",
      "daily_percentage": 12
    },
    "Salz": {
      "per_100g": "0.71g",
      "daily_percentage": 12
    }
  },
  "Referenzmenge": "8400 kJ/2000kcal"
}
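One way to gain some confidence in an output like this is to cross-check each per-100g value against its declared %-RM. The sketch below is only an illustration: the helper names `parse_quantity` and `mismatched_percentages` are hypothetical, the reference intakes are the EU values for an average adult (Regulation (EU) No 1169/2011, Annex XIII) rather than anything stated in this thread, and the 1-point tolerance is an assumption to absorb rounding on the label.

```python
import re

# Assumption: EU reference intakes for an average adult
# (energy in kcal, everything else in grams). Not taken from this thread.
REFERENCE_INTAKE = {
    "Energie": 2000, "Fett": 70, "gesättigte Fettsäuren": 20,
    "Kohlenhydrate": 260, "Zucker": 90, "Eiweiß": 50, "Salz": 6,
}

# The "Nährwerte" part of the JSON above, copied as a Python dict.
NUTRIENTS = {
    "Energie": {"per_100g": {"kJ": 2086, "kcal": 499}, "daily_percentage": 25},
    "Fett": {"per_100g": "26.0g", "daily_percentage": 37},
    "gesättigte Fettsäuren": {"per_100g": "14.0g", "daily_percentage": 70},
    "Kohlenhydrate": {"per_100g": "58.0g", "daily_percentage": 22},
    "Zucker": {"per_100g": "32.0g", "daily_percentage": 36},
    "Eiweiß": {"per_100g": "6.2g", "daily_percentage": 12},
    "Salz": {"per_100g": "0.71g", "daily_percentage": 12},
}


def parse_quantity(value):
    """Turn '26.0g' or '26,0 g' into (26.0, 'g')."""
    match = re.fullmatch(r"\s*(\d+(?:[.,]\d+)?)\s*([a-zA-Z]+)\s*", value)
    if match is None:
        raise ValueError(f"cannot parse quantity: {value!r}")
    return float(match.group(1).replace(",", ".")), match.group(2)


def mismatched_percentages(nutrients, tolerance=1.0):
    """Return nutrients whose %-RM disagrees with their per-100g amount."""
    mismatches = []
    for name, entry in nutrients.items():
        per_100g = entry["per_100g"]
        if isinstance(per_100g, dict):
            # Energy is given as a dict; compare using kcal.
            amount = float(per_100g["kcal"])
        else:
            amount, _unit = parse_quantity(per_100g)
        expected = 100 * amount / REFERENCE_INTAKE[name]
        if abs(expected - entry["daily_percentage"]) > tolerance:
            mismatches.append(name)
    return mismatches


print(mismatched_percentages(NUTRIENTS))  # prints [] for the values above
```

An empty list only means the values and percentages agree with each other; it does not prove they match the label.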
Following a discussion with @raphael0202, we might want to check whether our classic nutrient detection can provide on-par performance before rushing into this.
Not on this product, as the nutritional values are not next to the nutrient names in the OCR text: https://robotoff.openfoodfacts.org/api/v1/predict/nutrient?ocr_url=%20https://images.openfoodfacts.org/images/products/433/725/629/5260/2.json
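To illustrate the layout problem, here is a minimal sketch of a naive matcher that expects a value right after each nutrient name. It is not Robotoff's actual implementation; the function name `naive_adjacent_match`, the nutrient list, and the regex are assumptions made for this example. The OCR string is the one from the prompt above, with the footnote omitted.

```python
import re

OCR_TEXT = (
    "Durchschnittliche\nNährwerte\n%-RM*\npro 100g\n2086 kJ\n499 kcal\n26,0g\n"
    "pro 100g\nEnergie\n25%\n37%\nFett\ndavon gesättigte\nFettsäuren\n"
    "Kohlenhydrate\ndavon Zucker\nEiweiß\nSalz\n"
    "14,0g\n58,0g\n32,0g\n6,2g\n0,71g\n70%\n22%\n36%\n12%\n12%\n"
)

# Hypothetical list of nutrient names to look for.
NUTRIENT_NAMES = ["Energie", "Fett", "Kohlenhydrate", "Zucker", "Eiweiß", "Salz"]


def naive_adjacent_match(text):
    """Pair each nutrient name with the quantity that directly follows it."""
    found = {}
    for name in NUTRIENT_NAMES:
        pattern = re.compile(
            rf"{re.escape(name)}\s+(\d+(?:[.,]\d+)?)\s*(?:kJ|kcal|g)\b"
        )
        match = pattern.search(text)
        if match:
            # Normalize the decimal comma used on German labels.
            found[name] = float(match.group(1).replace(",", "."))
    return found


print(naive_adjacent_match(OCR_TEXT))  # prints {'Salz': 14.0}
```

Only "Salz" is matched, and with the wrong value: 14,0g is the first quantity of the value column that happens to follow the name column, while the salt value in the ChatGPT output above is 0.71g. Any approach relying on text adjacency needs the layout (for example word coordinates from the OCR JSON) to pair names and values correctly.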
GPT3.5 output (incorrect prediction)
{
  "Nährwerte": {
    "Energie": {
      "pro 100g": "2086 kJ",
      "%-RM*": "25%",
      "kcal": "499"
    },
    "Fett": {
      "pro 100g": "26,0g",
      "davon gesättigte Fettsäuren": "58,0g",
      "%-RM*": "37%"
    },
    "Kohlenhydrate": {
      "pro 100g": "32,0g",
      "davon Zucker": "6,2g",
      "%-RM*": "12%"
    },
    "Eiweiß": {
      "pro 100g": "14,0g",
      "%-RM*": "12%"
    },
    "Salz": {
      "pro 100g": "0,71g",
      "%-RM*": "12%"
    }
  },
  "Referenzmenge": "RM = Referenzmenge für einen durchschnittlichen Erwachsenen (8400 kJ/2000kcal)."
}
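An incorrect prediction like this one can often be caught without looking at the image, using simple plausibility checks. The sketch below assumes flattened per-100g values with hypothetical English key names, uses the standard Atwater factors (9 kcal/g for fat, 4 kcal/g for carbohydrates and proteins), and picks a 10% energy tolerance arbitrarily. The function name `sanity_check` is made up for this example.

```python
# Per-100g values transcribed from the two outputs in this thread.
first_output = {
    "energy_kcal": 499, "fat": 26.0, "saturated_fat": 14.0,
    "carbohydrates": 58.0, "sugars": 32.0, "proteins": 6.2, "salt": 0.71,
}
gpt35_output = {
    "energy_kcal": 499, "fat": 26.0, "saturated_fat": 58.0,
    "carbohydrates": 32.0, "sugars": 6.2, "proteins": 14.0, "salt": 0.71,
}


def sanity_check(n, tolerance=0.10):
    """Return a list of plausibility problems found in per-100g values."""
    problems = []
    # A sub-nutrient cannot exceed its parent nutrient.
    if n["saturated_fat"] > n["fat"]:
        problems.append("saturated fat exceeds total fat")
    if n["sugars"] > n["carbohydrates"]:
        problems.append("sugars exceed carbohydrates")
    # Energy estimated from macronutrients should be close to the declared value.
    estimated_kcal = 9 * n["fat"] + 4 * n["carbohydrates"] + 4 * n["proteins"]
    if abs(estimated_kcal - n["energy_kcal"]) > tolerance * n["energy_kcal"]:
        problems.append("energy does not match macronutrients")
    return problems


print(sanity_check(first_output))  # prints []
print(sanity_check(gpt35_output))
# prints ['saturated fat exceeds total fat', 'energy does not match macronutrients']
```

Checks of this kind would apply equally to model output and to classic nutrient detection, so they do not favor either approach.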
Part of #289