Problem
The current threshold for triggering a quality error on a vitamin's amount is 105g per 100g. I think it is not precise enough and we can do much better! On the screenshot below, only 5 values raise an error, although all of them should, since the unit is wrong. Here the problem comes from the units, but it could just as well come from a typo, as it often does in nutrition facts errors.
Proposed solution
I suggest we establish a specific threshold for each vitamin so we can detect new quality errors. I've extracted the max values for each vitamin so we can set a threshold value above the max known value.
Data I've extracted from CIQUAL and USDA (the proposed threshold here is 4 times greater than the maximum known value): nutrient max values.ods
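To make the idea concrete, here is a minimal Python sketch of a per-vitamin check. Everything in it is an assumption for illustration: the names `MAX_KNOWN_G_PER_100G`, `threshold_g` and `exceeds_threshold` are hypothetical, the nutrient keys are made up, and the max values are placeholders, not figures from the spreadsheet. It is not how the existing quality checks are implemented.

```python
# Hypothetical sketch: names and numbers are illustrative placeholders,
# NOT values taken from the CIQUAL/USDA extract.

GLOBAL_LIMIT_G = 105.0  # current one-size-fits-all limit, in g per 100 g
SAFETY_FACTOR = 4       # arbitrary factor, still open for discussion

# Maximum known amount per 100 g, in grams (placeholder values only).
MAX_KNOWN_G_PER_100G = {
    "vitamin-c": 2.0,
    "vitamin-d": 0.0003,
}


def threshold_g(nutrient, factor=SAFETY_FACTOR):
    """Return the per-nutrient threshold: max known value times the factor."""
    return MAX_KNOWN_G_PER_100G[nutrient] * factor


def exceeds_threshold(nutrient, amount_g_per_100g, factor=SAFETY_FACTOR):
    """Return True if the amount should raise a quality error."""
    if nutrient not in MAX_KNOWN_G_PER_100G:
        # No specific data: fall back to the current global limit.
        return amount_g_per_100g > GLOBAL_LIMIT_G
    return amount_g_per_100g > threshold_g(nutrient, factor)
```

With these placeholder numbers, a vitamin C amount of 60 mg mistakenly entered as 60 g would be flagged by `exceeds_threshold("vitamin-c", 60.0)`, whereas the current 105g limit would let it through.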
Expected outcome
Many new errors would be raised, but that would make it easier to fix the products' data and so improve the overall quality of the database :)
Note: I've arbitrarily chosen a factor of 4 for the example, but we should discuss together which value would work best.