revan / RU-Food-Scraper

Scrapes basic nutrition information for Rutgers menus into JSON.
http://vps.rsopher.com/nutrition.json

Correct formatting issues #2

Open dbordak opened 10 years ago

dbordak commented 10 years ago

As it is, some items have incorrectly formatted descriptions. For example, 'SHRIMP SCAMPI' has two spaces between the words instead of one. These issues are unpredictable because of where the descriptions come from, but attempting to fix them would probably make the API more valuable.

revan commented 10 years ago

Regex replace multiple spaces with one?

dbordak commented 10 years ago

That was just an example. This is a bit of a broader issue, i.e. you should recognize what kinds of formatting issues come up often and attempt to fix them.

But yeah, I'm sure there's a regex to reduce variable amounts of whitespace into a single space, which is probably what you'd want here.
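
A minimal sketch of the whitespace fix discussed above, assuming the scraper is written in Python and that each description is a plain string. The function name `normalize_description` is hypothetical and not taken from the repository. It only collapses whitespace; other recurring formatting problems would need their own rules.

```python
import re

# Matches one or more consecutive whitespace characters (spaces, tabs, newlines).
_WHITESPACE_RUN = re.compile(r"\s+")


def normalize_description(text: str) -> str:
    """Collapse each run of whitespace into a single space and trim both ends.

    Hypothetical helper for illustration; not part of the scraper itself.
    """
    return _WHITESPACE_RUN.sub(" ", text).strip()


if __name__ == "__main__":
    print(normalize_description("SHRIMP  SCAMPI"))
```

Using `\s+` rather than a pattern for literal spaces also catches tabs and stray newlines, which seems likely to be useful for scraped text, though it would flatten any line breaks that are intentional.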