q-m / food-ingredient-parser-ruby

Extract the structure of ingredient lists on food products
MIT License

Handle separator in "stabilisatoren: e407-e412-e415" #12

Closed by wvengen 5 years ago

wvengen commented 5 years ago

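Sometimes a dash is used as a separator, as in "stabilisatoren: e407-e412-e415", but not always: in "kleurstof: paprika-extract" the dash is part of the ingredient name. The parser should treat the dash as a separator in the first case only.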

This happens in about 0.2% of ingredient lists. For examples, run grep -i ':\s*e[0-9]\+-e[0-9]\+' data/ingredient-samples-nl
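
One possible way to tell the two cases apart is to split on the dash only when every resulting part looks like an E-number. The sketch below illustrates that idea in plain Ruby. It is not the gem's actual implementation: the constant E_NUMBER, the method split_e_number_list, and the exact E-number pattern (three or four digits with an optional letter suffix) are assumptions made for illustration.

```ruby
# Hypothetical helper, not part of food-ingredient-parser-ruby's API.
# Assumed E-number shape: "e" + 3 or 4 digits + optional letter (e.g. e160c).
E_NUMBER = /\Ae\d{3,4}[a-z]?\z/i

# Splits text on dashes only if all parts are E-numbers;
# otherwise returns the text unchanged as a single-element array.
def split_e_number_list(text)
  parts = text.split("-").map(&:strip)
  return [text] unless parts.size > 1 && parts.all? { |part| part.match?(E_NUMBER) }

  parts
end

split_e_number_list("e407-e412-e415")  # dash treated as separator
split_e_number_list("paprika-extract") # dash kept as part of the name
```

Requiring all parts to match keeps mixed cases such as a single E-number followed by a hyphenated word intact, at the cost of missing lists that mix E-numbers with plain names.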