MacHu-GWU / uszipcode-project

USA zipcode programmable database, includes up-to-date census and geometry information.
MIT License

Partial results in common_city_list #36

Open abhishekperambai opened 4 years ago

abhishekperambai commented 4 years ago

When querying common_city_list by zipcode, the package returns partial results: the last entry is cut off.

Example: common_city_list for zipcode 99505

from uszipcode import SearchEngine
search = SearchEngine()
search.by_zipcode("99505").to_dict()["common_city_list"]
['Jber', 'Anchorage', 'Fort Richardson', 'Ft Richardso...']

mPyth commented 2 years ago

Across the 50 US states plus the District of Columbia, 665 zip codes have a truncated common_city_list field (it ends with three dots). This is an obvious bug.

Here are some of them:

98155: Seattle, Lake Forest Park, Lk Forest Pk, Shorelin...
10591: Tarrytown,N Tarrytown, North Tarrytown, Sleepy Hol...
18424: Gouldsboro, Clifton, Clifton Township, Clifton Twp, ...
26525: Bruceton Mills, Brandonville, Bruceton Mls, Cuzzart, Haz...
32081: Ponte Vedra, Ponte Vedra Beach, Tn Of Nocatee, Town O...
41465: Salyersville, Bethanna,Burning Fork, Carver, Cisco, C...
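
A check for this kind of truncation could be sketched as follows. This is only an illustration: the helper name `truncated_entries` is hypothetical, it assumes a truncated entry is one that ends with three dots, and it only detects the problem without recovering the missing names.

```python
def truncated_entries(city_list, marker="..."):
    """Return the entries that look cut off, i.e. that end with the marker.

    Assumption: truncation always shows up as a trailing "...".
    """
    return [name for name in city_list if name.rstrip().endswith(marker)]


# Value reported above for zipcode 99505.
sample = ['Jber', 'Anchorage', 'Fort Richardson', 'Ft Richardso...']
cut_off = truncated_entries(sample)
print(cut_off)
```

Running the same check over the common_city_list of every zip code would be one way to arrive at a count like the one above.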

The Python file uszipcode/db.py references the zip code table it uses: http://federalgovernmentzipcodes.us/ . That table contains this information (in a different format).

The zip code with the most 'acceptable' cities (aka common_city_list) is 41465 (the last one in the list above), with 31 such cities. Its common_city_list (the primary city is Salyersville) is: Bethanna, Burning Fork, Carver, Cisco, Conley, Cutuno, Cyrus, Duco, Edna, Elsie, Ever, Flat Fork, Foraker, Fredville, Fritz, Gapville, Gifford, Hager, Harper, Hendricks, Ivyton, Lickburg, Logville, Maggard, Marshallville, Mashfork, Seitz, Stella, Sublett, Swampton, Wonnie. Interestingly, each of these 31 entries carries real information: each is an actual village, not a dummy abbreviation (in other zip codes, half of the entries are unusable abbreviations).

Problems with the table mentioned above (http://federalgovernmentzipcodes.us/):

* It contains 64 fewer ZIP codes than uszipcode.
* Longitude and latitude in both databases are precise only to 1/100 of a degree, which is about 1000 m in Florida and 750 m in North Dakota (in the longitude direction).
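
The longitude figures can be roughly reproduced with a short calculation. This is a sketch under stated assumptions: a spherical Earth with about 111.32 km per degree of longitude at the equator, and representative latitudes of 28° for Florida and 47.5° for North Dakota (both chosen here for illustration, not taken from either database).

```python
import math

# Assumption: spherical Earth, ~111.32 km per degree of longitude at the equator.
METERS_PER_DEGREE_AT_EQUATOR = 111_320.0


def longitude_step_meters(latitude_deg, step_deg=0.01):
    """East-west distance covered by step_deg of longitude at a given latitude."""
    return step_deg * METERS_PER_DEGREE_AT_EQUATOR * math.cos(math.radians(latitude_deg))


florida_m = longitude_step_meters(28.0)        # assumed latitude for Florida
north_dakota_m = longitude_step_meters(47.5)   # assumed latitude for North Dakota
print(round(florida_m), round(north_dakota_m))
```

With these inputs the results come out close to 1000 m and 750 m respectively. In the latitude direction, 1/100 of a degree is roughly 1.1 km everywhere.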