mozilla-mobile / prox-server

[INACTIVE] Server & data scripts for the Prox client.
https://github.com/mozilla-mobile/prox/
Mozilla Public License 2.0

Match Yelp ID to Wikipedia pages #93

Closed: mcomella closed 7 years ago

mcomella commented 7 years ago

Fuzzy name matching? Note: unlike with the other APIs, there is less of a guarantee that a place will have an associated Wikipedia page (restaurants, for example, often have none).
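
To make the idea concrete, here is a minimal sketch of fuzzy name matching using only Python's standard library (`difflib.SequenceMatcher`). It is not the code used in this repo: the function names, the `0.8` threshold, and the candidate list are all assumptions for illustration. It returns `None` when nothing scores high enough, which covers places that have no Wikipedia page.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive similarity in [0, 1] between two names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(place_name, candidate_titles, threshold=0.8):
    """Return the most similar candidate title, or None if none is close enough.

    `threshold` is an assumed cutoff, not a tuned value.
    """
    scored = [(name_similarity(place_name, t), t) for t in candidate_titles]
    # default covers an empty candidate list (no nearby pages at all)
    score, title = max(scored, default=(0.0, None))
    return title if score >= threshold else None


# Hypothetical usage: candidates would come from pages near the place's coordinates.
print(best_match("Coit Tower", ["Coit Tower", "Lombard Street"]))
print(best_match("Some Restaurant", []))
```

Lowering the threshold finds more pages but also produces more false matches, so the cutoff would need tuning against real data.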

mcomella commented 7 years ago

Prox shows information from all existing sources - Yelp, TripAdvisor and Wikipedia

mcomella commented 7 years ago

Current status: I can transform (place_name, coord) -> Wikipedia ID for:

Some examples that we can't do right now:

  1. Ferry Building Marketplace -> San Francisco Ferry Building
  2. InterContinental Mark Hopkins -> Mark Hopkins Hotel
  3. Tonga Room & Hurricane Bar -> Tonga Room
  4. The Scarlet Huntington -> Huntington Hotel (San Francisco)

I think case 1 is doable. Cases 2 and 3 seem doable if we're okay with more false matches (e.g. by using partial_token matches). Case 4 may not be feasible.
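
As a rough illustration of the trade-off, the sketch below scores two names by the share of the shorter name's tokens that also appear in the longer one. This is a stdlib stand-in for a partial token match, not the algorithm actually used here; `tokens`, `partial_token_score`, and the stopword list are hypothetical, and coordinates are ignored entirely.

```python
import re

# Assumed stopword list; purely illustrative.
STOPWORDS = {"the", "and", "of"}


def tokens(name):
    """Lowercase, drop parenthesised disambiguators, and split into a token set."""
    name = re.sub(r"\(.*?\)", " ", name.lower())
    return {t for t in re.findall(r"[a-z0-9]+", name) if t not in STOPWORDS}


def partial_token_score(a, b):
    """Fraction of the shorter name's tokens that also occur in the other name."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / min(len(ta), len(tb))


pairs = [
    ("Ferry Building Marketplace", "San Francisco Ferry Building"),
    ("InterContinental Mark Hopkins", "Mark Hopkins Hotel"),
    ("Tonga Room & Hurricane Bar", "Tonga Room"),
    ("The Scarlet Huntington", "Huntington Hotel (San Francisco)"),
]
for yelp_name, wiki_title in pairs:
    print(f"{partial_token_score(yelp_name, wiki_title):.2f}  {yelp_name} -> {wiki_title}")
```

Under these assumptions, pairs 1 and 2 score 0.67, pair 3 scores 1.00, and pair 4 scores 0.50. A cutoff low enough to accept pair 4 would also accept unrelated names that share a single token (any other "Huntington ..." title, for instance), which is the false-match risk.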

mcomella commented 7 years ago

Testing my algorithm (above) on 29 places in SF that have both a Wikipedia page and a Yelp page: we match 86% (25/29), while Factual matches 48% (14/29), all of which we also match.

I think this may fall into "good enough" territory.

mcomella commented 7 years ago

100. Not sure it's good enough, but we can file a follow-up.