bcgov / ols-geocoder

Physical Address Geocoder
Apache License 2.0

Compound word dictionary used in data prep to split official street names has gaps and irrelevant splits #197

Open mraross opened 3 years ago

mraross commented 3 years ago

Using the geocoder 4.1 test environment located at:

https://cmhodgson.github.io/ols-devkit/ols-demo/?gc=tst

Enter the following:

  birds eye

and the geocoder returns:

 Duncan, BC

instead of the correct match:

 Birdseye Dr, Duncan, BC

but glue birds and eye together (birdseye) and you get the correct match.

This is because the compound word dictionary doesn't include the word birdseye. This dictionary wasn't designed for street names so there are many similar omissions.

More examples of missing words in the compound word dictionary:

 Hillcrest
 Searidge
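
To illustrate the failure mode, here is a minimal Python sketch of how a compound word dictionary might be applied during data prep. It is not the geocoder's actual code: the names `compound_words`, `index_forms`, and `build_index`, and the dictionary contents, are all hypothetical. The idea is that each street name is indexed under its official spelling plus a split form, but only when the dictionary knows how to split it.

```python
# Hypothetical compound word dictionary: joined word -> its parts.
# "birdseye" is deliberately missing, mirroring the gap described above.
compound_words = {
    "hillside": ("hill", "side"),
}


def index_forms(street_name, dictionary):
    """Return the searchable forms of one street name word."""
    key = street_name.lower()
    forms = {key}
    parts = dictionary.get(key)
    if parts:
        # Only names found in the dictionary get a split form.
        forms.add(" ".join(parts))
    return forms


def build_index(street_names, dictionary):
    """Map every searchable form back to its official street name."""
    index = {}
    for name in street_names:
        for form in index_forms(name, dictionary):
            index.setdefault(form, name)
    return index


names = ["Birdseye", "Hillside"]

# With the gap in the dictionary, only the glued form matches.
index_before = build_index(names, compound_words)
missing = index_before.get("birds eye")   # no match
glued = index_before.get("birdseye")      # matches the official name

# Adding the missing entry makes the split form searchable too.
fixed_words = dict(compound_words, birdseye=("birds", "eye"))
index_after = build_index(names, fixed_words)
found = index_after.get("birds eye")
```

In this sketch, `birds eye` only resolves to `Birdseye` once the dictionary contains an entry for `birdseye`, which is the behaviour reported above for the real dictionary.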

mraross commented 3 years ago

This is being handled by #208.