opendata / Open-Data-Needs

An ongoing effort to catalog the holes in the open data ecosystem. [RETIRED]

Physical address normalization #1

Open waldoj opened 10 years ago

waldoj commented 10 years ago

None of the existing code for normalizing addresses is up to snuff. That is, none of it is even close to being able to get CASS certification. The open data ecosystem needs a high-quality open source address correction tool. Better still (and this would be amazing), it needs such a tool that is actually CASS certified. I have no idea what costs are associated with that, but certification would qualify users for USPS postage discounts.
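For a sense of what the very simplest layer of normalization looks like, here is a minimal Python sketch. It only uppercases, strips punctuation, and swaps a handful of words for their common USPS-style abbreviations. The function name `normalize_street_line` and the small lookup table are made up for illustration; this is nowhere near what CASS requires (no address validation, no ZIP+4 lookup, no handling of ambiguous cases such as a street actually named "North Street").

```python
import re

# Hypothetical, deliberately tiny lookup table. A real tool would need the
# full set of suffix, directional, and unit abbreviations, plus rules for
# deciding when a word is a street name rather than a suffix or directional.
ABBREVIATIONS = {
    "STREET": "ST",
    "AVENUE": "AVE",
    "BOULEVARD": "BLVD",
    "ROAD": "RD",
    "DRIVE": "DR",
    "NORTH": "N",
    "SOUTH": "S",
    "EAST": "E",
    "WEST": "W",
    "APARTMENT": "APT",
    "SUITE": "STE",
}


def normalize_street_line(line: str) -> str:
    """Uppercase a street line, drop punctuation, and abbreviate known words."""
    # Remove everything except word characters, whitespace, '#', '/', and '-'.
    cleaned = re.sub(r"[^\w\s#/-]", "", line.upper())
    # split() with no arguments also collapses runs of whitespace.
    tokens = cleaned.split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


if __name__ == "__main__":
    print(normalize_street_line("123 North Main Street, Apartment 4"))
```

Everything hard about the problem (confirming the address exists, correcting misspellings, filling in missing components) sits on top of this kind of token cleanup and needs reference data that the sketch does not have.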

It's important to note that, since 2007, CASS software has needed to verify that an address actually exists, and also to check with the Locatable Address Conversion System (LACS) to see whether an address has changed. I haven't done anything with CASS since 2007, so I don't have any experience with this. My guess is that licensing LACS is expensive, and it might only be realistic to create software that adheres to the pre-2007 CASS requirements.

TonyPapousekFP commented 10 years ago

Not sure if it's of any use, but the USPS now offers an address information API that will spit out XML.

waldoj commented 10 years ago

We're approaching the one-year anniversary of my application to use their API. I've never heard back. From my conversations with others who have applied, this is standard. :(

TonyPapousekFP commented 10 years ago

It might be worth giving them a phone call. Their contact info is inside the API's documentation. I have a feeling that with enough pestering they'll speed up the application process.

drwelden commented 9 years ago

I was attempting to research this myself and found a useful bit of information that I figured I should leave here. As mentioned above, CASS certification now requires a LACS check. Licensing costs aside, even if some kind open source philanthropist were to pay them, the license to use LACS prohibits exposing the data, meaning it is impossible to have a 100% open source CASS-certified library.

waldoj commented 9 years ago

Well, that's a mighty shame. I guess any solution, then, would just have to be as close as is possible, under the circumstances. Thanks for that update!

Downchuck commented 9 years ago

At some point this project may gather enough data: http://openaddresses.io/

Mathnerd314 commented 7 years ago

Here's a library: https://github.com/openvenues/libpostal. Someone with more spare time than me could check how it does on the CASS Stage 1 dataset.
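A rough sketch of what such a check could look like in Python, assuming the test data can be reduced to pairs of (raw input, expected normalized output). The `score_normalizer` helper, the stand-in normalizer, and the two sample cases are all hypothetical; they are not taken from the CASS file, and a wrapper around libpostal's bindings would need to be plugged in as the `normalize` argument.

```python
from typing import Callable, Iterable, List, Tuple

Case = Tuple[str, str]            # (raw input, expected normalized form)
Failure = Tuple[str, str, str]    # (raw input, expected, actual)


def score_normalizer(
    normalize: Callable[[str], str], cases: Iterable[Case]
) -> Tuple[float, List[Failure]]:
    """Run a normalizer over test cases; return (accuracy, list of failures)."""
    total = 0
    failures: List[Failure] = []
    for raw, expected in cases:
        total += 1
        actual = normalize(raw)
        if actual != expected:
            failures.append((raw, expected, actual))
    # Report 0.0 for an empty test set rather than dividing by zero.
    accuracy = (total - len(failures)) / total if total else 0.0
    return accuracy, failures


if __name__ == "__main__":
    # Stand-in normalizer: uppercase and collapse whitespace only.
    def trivial(address: str) -> str:
        return " ".join(address.upper().split())

    # Made-up sample cases, for illustration only.
    sample_cases = [
        ("123  main st", "123 MAIN ST"),
        ("456 Elm Avenue", "456 ELM AVE"),
    ]
    accuracy, failures = score_normalizer(trivial, sample_cases)
    print(f"accuracy: {accuracy:.0%}")
    for raw, expected, actual in failures:
        print(f"  {raw!r}: expected {expected!r}, got {actual!r}")
```

Keeping the list of failures alongside the score makes it easier to see which kinds of addresses a given library gets wrong, which matters more than the headline number.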