foursquare / twofishes

MOVED - The project is still under development but this page is deprecated.
https://github.com/foursquare/fsqio

Postalcodes are not included in alternate names #91

Open DeaconDesperado opened 8 years ago

DeaconDesperado commented 8 years ago

The GeoNames database exposes US postal codes as alternate names with the language code "post".

 alternatenameid | geonameid | isolanguage |                    alternatename                    | ispreferredname | isshortname | iscolloquial | ishistoric
-----------------+-----------+-------------+-----------------------------------------------------+-----------------+-------------+--------------+------------
         2732459 |   4838652 | post        | 06460                                               |                 |             |              |
         3006455 |   4838652 | link        | http://en.wikipedia.org/wiki/Milford%2C_Connecticut |                 |             |              |
         3045158 |   4838652 | ru          | Милфорд                                             |                 |             |              |
         8220181 |   4838652 | kk          | Милфорд                                             |                 |             |              |
         8220182 |   4838652 | uk          | Мілфорд                                             |                 |             |              |
         8220183 |   4838652 | mr          | मिलफोर्ड                                             |                 |             |              |
         8220184 |   4838652 | ko          | 밀퍼드                                              |                 |             |              |
         8220185 |   4838652 | zh          | 米尔福德                                            |                 |             |              |
         8710578 |   4838652 | sr          | Милфорд                                             |                 |             |              |
        10726156 |   4838652 | post        | 06466                                               |                 |             |              |
(10 rows)
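For illustration, here is a minimal sketch of picking out the "post" rows from data shaped like the table above. It assumes a tab-separated input whose first four columns are alternatenameid, geonameid, isolanguage, and alternatename, in that order. The function name `postal_codes_by_geonameid` is hypothetical, and this is not how the twofishes indexer handles the data.

```python
import csv
import io
from collections import defaultdict


def postal_codes_by_geonameid(lines):
    """Collect alternate names whose isolanguage is "post", keyed by geonameid.

    Assumes tab-separated rows with at least four columns:
    alternatenameid, geonameid, isolanguage, alternatename.
    """
    codes = defaultdict(list)
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 4:
            continue  # skip blank or malformed rows
        _, geonameid, isolanguage, name = row[:4]
        if isolanguage == "post":
            codes[int(geonameid)].append(name)
    return dict(codes)


# Sample rows copied from the table above (trailing flag columns left empty).
sample = io.StringIO(
    "2732459\t4838652\tpost\t06460\t\t\t\t\n"
    "3006455\t4838652\tlink\thttp://en.wikipedia.org/wiki/Milford%2C_Connecticut\t\t\t\t\n"
    "10726156\t4838652\tpost\t06466\t\t\t\t\n"
)
print(postal_codes_by_geonameid(sample))
```

The postal codes are kept as strings so that leading zeros such as in 06460 survive.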

Is there a way to include this name during the index building phase?

DeaconDesperado commented 8 years ago

Could this be the relevant filter?

https://github.com/foursquare/twofishes/blob/master/indexer/src/main/scala/importers/geonames/AlternateNamesReader.scala#L59

parkan commented 8 years ago

Postcodes are parsed separately; see https://github.com/foursquare/twofishes/blob/d609e0a37a0765e9a83dc73367c33586b67af681/indexer/src/main/scala/importers/geonames/GeonamesFeature.scala#L59

DeaconDesperado commented 8 years ago

Thanks @parkan!

Do you know how this is exposed in the responses? I started out looking for it in names, since that's how GeoNames stores them. Is there a separate key dedicated to postal codes?

rahulpratapm commented 8 years ago

http://demo.twofishes.net/static/geocoder.html?query=10038

As you can see here, postal codes are treated like any other feature. There's no dedicated key, although the way we hack ids for these features in the geonamezip namespace allows you to parse out the country and postal code pretty easily. Notice that the ids field for the feature above has a single FeatureId object whose source is geonamezip and whose id is US-10038.
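A rough client-side sketch of that parsing, assuming the feature has already been decoded from the JSON response into a dict with an ids list of objects carrying source and id keys. The function name `parse_geonamezip_id` is hypothetical, and the COUNTRY-POSTALCODE layout is inferred from the single US-10038 example.

```python
from typing import Optional, Tuple


def parse_geonamezip_id(feature: dict) -> Optional[Tuple[str, str]]:
    """Return (country, postal_code) from a feature's geonamezip id, if any.

    Assumes ids look like "US-10038": a country code, a hyphen, then the
    postal code. Only the first hyphen is treated as the separator, so a
    postal code that itself contains a hyphen is kept whole.
    """
    for feature_id in feature.get("ids", []):
        if feature_id.get("source") != "geonamezip":
            continue
        country, sep, postal_code = feature_id.get("id", "").partition("-")
        if sep and country and postal_code:
            return country, postal_code
    return None


# Feature shape assumed from the description above.
feature = {"ids": [{"source": "geonamezip", "id": "US-10038"}]}
print(parse_geonamezip_id(feature))  # ('US', '10038')
```

Features from other namespaces return None, so the same helper can be used to tell postal code features apart from the rest.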