nysenate / USPS-AMS-WebService

A network wrapper around the USPS Address Matching System API library that utilizes JNI to connect a Java servlet with the AMS proprietary C library.

Convert full state names to 2-letter abbreviations in API requests #6

Open bobo333 opened 6 years ago

bobo333 commented 6 years ago

USPS AMS accepts only a 2-character string (plus null terminator) for the state field, as the declaration in zip4.h shows:

typedef struct tagZip4Context
{
    /*********** input data ***********/
    ...
    char     istai[2+1];             /* input state                          */
    ...
} ZIP4_PARM;

However, the API places no such limit on the state query parameter, which leads to confusing results when a full state name is sent: USPS AMS ends up using only the first 2 letters. For example:

api/validate?&addr1=148+bennett+ave&city=waterbury&state=connecticut will return

...
  "address" : {
    "firm" : "",
    "addr1" : "148 BENNETT AVE",
    "addr2" : "",
    "city" : "WATERBURY",
    "state" : "CO",
    "zip5" : "",
    "zip4" : ""
  }
...

The response uses CO for the state because those are the first 2 letters of "connecticut". But CO is the abbreviation for Colorado, not Connecticut, so AMS finds no match even though the address does exist in Connecticut.
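To make the failure mode concrete, here is a minimal Java sketch of what squeezing a state string into a `char[2+1]` field amounts to. The class and method names are hypothetical and this is not the project's actual code; the upper-casing is an assumption made only to match the `"CO"` seen in the response above.

```java
import java.util.Locale;

// Hypothetical illustration only; not taken from USPS-AMS-WebService.
final class StateTruncation {
    /**
     * Mimics copying a state string into a C buffer declared as char[2+1]:
     * at most two characters survive, followed by the null terminator.
     */
    static String fitToStateField(String state) {
        // Upper-casing is an assumption, chosen to match the sample response.
        String upper = state.trim().toUpperCase(Locale.ROOT);
        return upper.length() <= 2 ? upper : upper.substring(0, 2);
    }
}
```

With this helper, both "connecticut" and "colorado" collapse to the same 2-letter value, which is exactly why the lookup lands in the wrong state.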

Full state names should be converted to their appropriate 2-letter abbreviations before submission to USPS AMS, when possible.
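A rough sketch of what that conversion could look like on the Java side is below. Everything here is illustrative: `StateAbbreviations` and `normalize` are hypothetical names, the lookup table is deliberately partial, and 2-letter inputs are passed through without being validated against a list of real abbreviations.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; names and table contents are assumptions for illustration.
final class StateAbbreviations {
    // Deliberately partial table; a real one would cover every state and territory.
    private static final Map<String, String> BY_NAME = Map.of(
            "COLORADO", "CO",
            "CONNECTICUT", "CT",
            "NEW YORK", "NY");

    /**
     * Returns a 2-letter abbreviation when one can be determined,
     * or an empty Optional when the input is not recognized.
     */
    static Optional<String> normalize(String state) {
        if (state == null) {
            return Optional.empty();
        }
        // Collapse internal whitespace so "new   york" still matches.
        String key = state.trim().replaceAll("\\s+", " ").toUpperCase(Locale.ROOT);
        if (key.length() == 2) {
            return Optional.of(key); // already abbreviated; not validated here
        }
        return Optional.ofNullable(BY_NAME.get(key));
    }
}
```

Returning an empty `Optional` for unknown names covers the "when possible" part: the caller can decide whether to reject the request or pass the value through unchanged.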

bobo333 commented 6 years ago

@kzalewski I'm happy to submit a PR to add this functionality if you're open to it

kzalewski commented 6 years ago

Hi @bobo333. Anthony has committed a fix for the issue that you reported. You are correct: our code was truncating the state name to two characters in order to comply with the ZIP4_PARM struct as defined in the AMS API. With his fix, we now map full state names to state abbreviations before passing them to the AMS API address inquiry function.

However, I was reviewing the AMS API documentation, and I think I have a better solution. When specifying a city/state/zip, the z4adrinq() function allows those three items to be specified individually (via ictyi, istai, and izipc), as we currently do, or together as a single field in ictyi.

The benefit of combining city/state/zip into the ictyi field is that AMS will perform fuzzy string matching on the city and state. For example, I tried specifying an address in Wallingford, CT. I was able to use "Wallingford, Connecticut", but I could also use "Wallingfert, Connecticut" (misspelled city) and "Wallingford, Conecticut" (misspelled state), and AMS was able to correct the city and/or state.

So, rather than having our own software map a full state name to the corresponding state abbreviation, I think we should take advantage of the fuzzy string matching that is built into AMS.

My plan is to use the three individual fields (ictyi, istai, izipc) if the API request contains those same fields as individual components AND the state is specified as two characters. For all other cases, I think we should pass the city/state/zip into the singular ictyi field and let AMS work its magic.
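As a sketch of that routing rule (not the committed implementation), the decision could look like the Java below. The `Inputs` holder, the `route` method, and the "City, State ZIP" layout of the combined string are all assumptions; the thread only confirms that "Wallingford, Connecticut" worked in ictyi. Field length limits are not handled here.

```java
import java.util.Locale;

// Hypothetical sketch of the proposed routing; all names are illustrative.
final class CityStateZipRouting {
    /** Holds the values destined for the AMS ictyi, istai and izipc inputs. */
    static final class Inputs {
        final String ictyi;
        final String istai;
        final String izipc;

        Inputs(String ictyi, String istai, String izipc) {
            this.ictyi = ictyi;
            this.istai = istai;
            this.izipc = izipc;
        }
    }

    private static String clean(String value) {
        return value == null ? "" : value.trim();
    }

    static Inputs route(String city, String state, String zip) {
        String c = clean(city);
        String s = clean(state);
        String z = clean(zip);

        // A 2-character state fits istai as-is: keep the three fields separate.
        if (s.length() == 2) {
            return new Inputs(c, s.toUpperCase(Locale.ROOT), z);
        }

        // Otherwise combine everything into ictyi and leave the rest to AMS.
        // The "City, State ZIP" layout is an assumption.
        StringBuilder combined = new StringBuilder(c);
        if (!s.isEmpty()) {
            if (combined.length() > 0) {
                combined.append(", ");
            }
            combined.append(s);
        }
        if (!z.isEmpty()) {
            if (combined.length() > 0) {
                combined.append(' ');
            }
            combined.append(z);
        }
        return new Inputs(combined.toString(), "", "");
    }
}
```

Under these assumptions, `route("Wallingford", "Connecticut", null)` puts "Wallingford, Connecticut" in ictyi and leaves istai and izipc empty, while `route("waterbury", "ct", "06708")` keeps the three fields separate.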

What do you think?

bobo333 commented 6 years ago

Hi @kzalewski, thank you for the considered response. And thank you @aa29cala for the fix.

In general I like the idea of offloading as much as possible to AMS, since it leaves less for us to maintain and keep in sync, so I like your proposed solution.

My only question is what happens if a zip code isn't provided at all. Does the AMS ictyi field handle a city and state with no zip code? Your examples seem to imply it does, but I'm just double-checking. If that case is handled, then this seems like a very comprehensive approach.

kzalewski commented 6 years ago

Yes, absolutely. In fact, most of my tests specified only a city and state (without a zip code) in the ictyi field. The AMS API corrected small errors in city and state names, and it returned the proper ZIP+4 code. It's actually quite good for an antiquated C-based API.

bobo333 commented 6 years ago

@kzalewski awesome! yes seems like a great solution to me, please let me know what I can do to help with implementation, happy to take a crack at it