HackJersey / garden-style

A Python library to clean and standardize NJ town names.
MIT License

Gather a corpus of ~2,000-5,000 dirty town-name records to test against. #2

Open tommeagher opened 11 years ago

esagara commented 11 years ago

I will put this together. I ran a quick query on the voter registration database and the names seem to be a bit more standardized than I remember. It could be that I cleaned it already. Another option is pulling from another, much dirtier data source, such as FEC campaign finance data. That will give us not only variations in words like mount, borough and township but also misspellings of town names.
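To make the two kinds of dirtiness concrete (abbreviated words like mount, borough and township, plus outright misspellings), here is a minimal sketch of how a cleaner might be exercised against such a corpus. Everything in it is an assumption for illustration: the `CANONICAL` list, the `ABBREVIATIONS` map, and the `normalize` and `match_town` helpers are hypothetical and are not part of this repo. It uses only the standard library's `re` and `difflib`.

```python
import difflib
import re

# Illustrative canonical names only; a real list would cover every NJ municipality.
CANONICAL = [
    "Mount Laurel Township",
    "Mount Olive Township",
    "Glassboro Borough",
    "Newark City",
]

# Assumed abbreviation map, not taken from any real data source.
ABBREVIATIONS = {
    "mt": "mount",
    "twp": "township",
    "twsp": "township",
    "boro": "borough",
}


def normalize(raw):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.sub(r"[^a-z\s]", " ", raw.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def match_town(raw, cutoff=0.8):
    """Return the closest canonical name, or None if nothing clears the cutoff."""
    lookup = {normalize(name): name for name in CANONICAL}
    hits = difflib.get_close_matches(normalize(raw), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None


print(match_town("Mt. Laurel Twp"))  # abbreviation variants
print(match_town("Mt Olvie Twp"))    # abbreviation variants plus a misspelling
```

The `cutoff` of 0.8 is an arbitrary starting point. A test corpus of real dirty names is what would show whether it is too strict or too loose.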

tommeagher commented 11 years ago

Any luck?

tommeagher commented 9 years ago

Added @CarlaAstudillo & @sstirling to this repo. Obviously, Eric and I didn't get very far with this. But if you two find any compelling use case for something like this, let me know. I'd be interested in resurrecting this. If not, no big deal. Got plenty to keep us busy.

sstirling commented 9 years ago

I'm sure we'll stumble across something before long.

tommeagher commented 9 years ago

Sorry if you got hit with a bunch of alerts, @esagara, @sstirling, @CarlaAstudillo. Just transferred this repo to HackJersey. No obligation for you. Just wanted to make sure you still have access if you want to contribute.