openvenues / pypostal

Python bindings to libpostal for fast international address parsing/normalization
MIT License

near_dupe_hashes returns empty list #49

Open MrNerdy42 opened 5 years ago

MrNerdy42 commented 5 years ago

The near_dupe_hashes function from the module postal.near_dupe seems to invariably return an empty list.

Here are some examples of US addresses I have tried. The tokens and labels were obtained with parse_address. The optional parameters do not change the output, but I have included them in the first call just in case.

near_dupe_hashes(['house_number', 'road', 'city', 'state', 'postcode'], ['209', 'st michaels circle', 'odenton', 'md', '21113'], with_city_or_equivalent=True, with_postal_code=True)

near_dupe_hashes(['house_number', 'road', 'city', 'state', 'postcode'], ['1', 'six flags blvd', 'jackson township', 'nj', '08527'])

near_dupe_hashes(['house_number', 'road', 'city', 'state', 'postcode'], ['1313', 'disneyland dr', 'anaheim', 'ca', '92802'])

dan197306 commented 5 years ago

Same here, I'm getting the same issue.

rabbiveesh commented 5 years ago

I solved this by adding the address_only_keys flag to the call. It would be nice if this API had ANY documentation.
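
A minimal sketch of that workaround, for anyone landing here. It assumes parse_address returns a list of (value, label) pairs and that near_dupe_hashes takes the labels list first and the values list second, as in the calls above. The helper names split_parsed and address_only_hashes are hypothetical and not part of pypostal; the only option passed is address_only_keys=True, as described in this comment.

```python
def split_parsed(parsed):
    """Split (value, label) pairs into parallel labels and values lists."""
    values = [value for value, _ in parsed]
    labels = [label for _, label in parsed]
    return labels, values


def address_only_hashes(parsed, hasher=None):
    """Hash a parsed address with address_only_keys=True.

    `hasher` defaults to postal.near_dupe.near_dupe_hashes. It is imported
    lazily so the helper can be exercised with a stand-in function when
    libpostal is not installed.
    """
    if hasher is None:
        from postal.near_dupe import near_dupe_hashes
        hasher = near_dupe_hashes
    labels, values = split_parsed(parsed)
    # Labels go first, values second, matching the calls in this issue.
    return hasher(labels, values, address_only_keys=True)


# Hand-written to mirror the first example in this issue; with pypostal
# installed, this list would come from parse_address instead.
parsed = [
    ('209', 'house_number'),
    ('st michaels circle', 'road'),
    ('odenton', 'city'),
    ('md', 'state'),
    ('21113', 'postcode'),
]
```

With libpostal and pypostal installed, address_only_hashes(parsed) should call the real function. I have not listed the hashes it returns, since they depend on the installed libpostal data.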

MrNerdy42 commented 5 years ago

I submitted an issue for documentation as well, but it does not seem like anyone monitors this page. Also, I'm no longer working on the project that I needed pypostal for, but I'll try this workaround if I get the chance.

mhmdmodan commented 3 years ago

Hi all, I did some digging and it looks like near_dupe_hashes is actually working as expected; the documentation is just terrible. I explain this in more detail in a reply on my duplicate issue here: https://github.com/openvenues/pypostal/issues/60#issuecomment-860910059

This issue can be closed.