q-m / food-ingredient-parser-ruby

Extract the structure of ingredient lists on food products
MIT License

Fix colon with bracketed amount #6

Closed. wvengen closed this issue 3 years ago

wvengen commented 6 years ago

The ingredient list `sauce: (50%) tomato, salt` is parsed incorrectly by both the strict and the loose parser.

strict: no result
loose: `{:contains=>[{:name=>"sauce", :amount=>"50%"}, {:name=>"salt"}]}`

This would be good to fix.
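To make the loose result concrete, here is a small self-contained sketch that compares the words in the input with the names in the reported hash. It does not call the parser. The hash is copied from the output above, and `ingredient_names` is a hypothetical helper written only for this illustration.

```ruby
# Loose parser result copied verbatim from the report above.
loose_result = { contains: [{ name: "sauce", amount: "50%" }, { name: "salt" }] }

# Hypothetical helper (not part of the gem): collect every :name in a result
# hash, descending into nested :contains arrays.
def ingredient_names(node)
  names = node[:name] ? [node[:name]] : []
  names + Array(node[:contains]).flat_map { |child| ingredient_names(child) }
end

input = "sauce: (50%) tomato, salt"

# Naive word split of the input; the bracketed amount contains no letters,
# so it is skipped.
words = input.scan(/[[:alpha:]]+/)

# Words from the input that never show up as an ingredient name.
missing = words - ingredient_names(loose_result)
puts missing.inspect
```

The only word left over is "tomato", so the loose parser drops an ingredient here rather than just misplacing the amount.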

wvengen commented 6 years ago

This happens in about 0.02% of the ingredient lists. For examples, run `cat data/ingredient-samples-nl | grep -v ':\\n' | grep ':\s*[(\[]\s*[0-9,.]\+\s*\(%\|g\|gr\|gram\|ml\)\.\?\s*[])]'`.
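For anyone who prefers to stay in Ruby, the two grep filters could be translated roughly as follows. This is a sketch, not code from the repository: the constant and method names are made up for illustration, and the regex is my reading of the grep pattern above.

```ruby
# Lines containing a colon followed by a literal backslash and "n" are
# skipped, mirroring `grep -v ':\\n'`. Single quotes keep the backslash literal.
ESCAPED_NEWLINE_AFTER_COLON = ':\n'

# A colon, then a bracketed amount such as "(50%)" or "[ 12,5 gr. ]",
# mirroring the second grep pattern.
COLON_BRACKETED_AMOUNT = /:\s*[(\[]\s*[0-9,.]+\s*(?:gram|gr|g|ml|%)\.?\s*[\])]/

# Hypothetical helper: true when a line would pass both grep filters.
def colon_with_bracketed_amount?(line)
  !line.include?(ESCAPED_NEWLINE_AFTER_COLON) && line.match?(COLON_BRACKETED_AMOUNT)
end

# Hypothetical helper: keep only the affected lines from any enumerable of strings.
def affected_lines(lines)
  lines.select { |line| colon_with_bracketed_amount?(line) }
end

puts colon_with_bracketed_amount?("sauce: (50%) tomato, salt")
```

To run it over the sample file, something like `affected_lines(File.foreach("data/ingredient-samples-nl"))` should work, assuming the file has one ingredient list per line.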

wvengen commented 3 years ago

This is fixed now with #15.