Quartz / bad-data-guide

An exhaustive reference to problems seen in real-world data along with suggestions on how to resolve them.

Suggest OpenRefine for spelling errors #19

Closed by mdlincoln 8 years ago

mdlincoln commented 8 years ago

I'm not sure how much you want to open this guide up to discussing specific methods for fixing these problems, but given that you already mention manual spelling correction, it might be worth noting OpenRefine's clustering function as an invaluable aid for working through that particular data issue.

:+1: for a marvelous guide!

onyxfish commented 8 years ago

Thanks, Matthew! I think this is general enough information to include in the guide.
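
For readers unfamiliar with the clustering mentioned above, here is a rough Python sketch of the key-collision idea behind OpenRefine's "fingerprint" method. It is an approximation written for illustration, not OpenRefine's actual code; the function names `fingerprint` and `cluster` are hypothetical, and the normalization is simplified (ASCII punctuation only). Values that reduce to the same key are grouped as likely variants of one another.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a string to a normalized key (simplified fingerprint)."""
    # Trim surrounding whitespace and lowercase.
    s = value.strip().lower()
    # Fold accented characters to their base letters (e.g. "ô" -> "o").
    s = unicodedata.normalize("NFKD", s)
    s = "".join(ch for ch in s if not unicodedata.combining(ch))
    # Drop ASCII punctuation (an assumption; a fuller version would
    # also handle Unicode punctuation and control characters).
    s = "".join(ch for ch in s if ch not in string.punctuation)
    # Split into tokens, remove duplicates, sort, and rejoin.
    return " ".join(sorted(set(s.split())))


def cluster(values):
    """Group values whose fingerprints collide; keep groups of 2 or more."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return [sorted(g) for g in groups.values() if len(g) > 1]


names = ["Tom Cruise", "Cruise, Tom", "tom cruise ", "Tôm Cruise", "Tom Hanks"]
for group in cluster(names):
    print(group)
```

A key like this catches differences in case, punctuation, word order, and accents. It will not catch a genuine typo such as a transposed letter; for those, OpenRefine also offers nearest-neighbor methods based on edit distance.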