e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

Enhance the suggestion system to gather instant user feedback #382

Open shankari opened 5 years ago

shankari commented 5 years ago

We would like the following behavior to allow users to get feedback on their suggestions:

shankari commented 5 years ago

The first step for this change was to return a list of potential businesses that users can choose from, instead of a single location. @valin1 attempted this in https://github.com/e-mission/e-mission-server/pull/657, but that does not appear to be sufficient. The primary call is to nominatim, which returns a single reverse-geocoded address; the radius applies only to the google API, which is called only if the nominatim call fails.

shankari commented 5 years ago

It seems like what we really want is to use the yelp business search API or the overpass.de POI query to get the list of candidate locations, instead of a reverse geocode, which will only ever return one value. The corresponding yelp query is documented at: https://www.yelp.com/developers/documentation/v3/business_search
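A minimal sketch of such a candidate lookup, assuming the Yelp Fusion v3 business search endpoint with a bearer API key. The helper names below are hypothetical and not part of e-mission; only the standard library is used.

```python
import json
import urllib.parse
import urllib.request

# Yelp Fusion v3 business search endpoint (see the documentation linked above).
YELP_SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(lat, lon, api_key, radius_m=100, limit=50):
    """Build (but do not send) a business search request around a point."""
    params = urllib.parse.urlencode({
        "latitude": lat,
        "longitude": lon,
        "radius": radius_m,  # metres
        "limit": limit,      # page size
    })
    return urllib.request.Request(
        YELP_SEARCH_URL + "?" + params,
        headers={"Authorization": "Bearer " + api_key},
    )


def candidate_names(response_json):
    """Return (alias, name) pairs for every business in a search response."""
    return [(b["alias"], b["name"]) for b in response_json.get("businesses", [])]


def fetch_candidates(lat, lon, api_key):
    """Send the request and return the candidate list (needs network and a key)."""
    with urllib.request.urlopen(build_search_request(lat, lon, api_key)) as resp:
        return candidate_names(json.load(resp))
```

Unlike a reverse geocode, this returns every business the API reports near the point, so the user can be offered a list to choose from.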

shankari commented 5 years ago

Let's do a concrete test of this. 37.7019950, -122.4706021 is the mall complex near Daly City BART with a bunch of restaurants and a movie theatre in the same building.

This is what return_address_from_location_nominatim returns:

In [1]: import emission.core.wrapper.suggestion_sys as sugg
Connecting to database URL localhost

In [2]: sugg.return_address_from_location_nominatim(37.7019950, -122.4706021)
Out[2]:
('Junipero Serra Boulevard, Westlake, Daly City, San Mateo County, 94014, United States of America',
 {'country': 'United States of America',
  'country_code': 'us',
  'county': 'San Mateo County',
  'neighbourhood': 'Westlake',
  'postcode': '94014',
  'road': 'Junipero Serra Boulevard',
  'town': 'Daly City'})

Testing further with the code from https://github.com/e-mission/e-mission-server/pull/657

In [4]: string_address, address_dict = sugg.return_address_from_location_nominatim(37.70
   ...: 19950, -122.4706021)

In [5]: business_key = list(address_dict.keys())[0]

In [6]: business_name = address_dict[business_key]

In [7]: (business_key, business_name)
Out[7]: ('road', 'Junipero Serra Boulevard')

In [8]: candidates = [(business_name, business_name.replace(' ', '-'))]

In [9]: candidates
Out[9]: [('Junipero Serra Boulevard', 'Junipero-Serra-Boulevard')]

I really don't see how the nominatim search is even helpful at this point.
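The underlying problem is that `list(address_dict.keys())[0]` takes whatever key happens to come first, which here is `road`. A sketch of a more explicit selection, assuming the address dict may carry a key such as `building`, `amenity` or `shop` when nominatim resolves to a named place. The key list and `pick_business_name` are illustrative assumptions, not existing code.

```python
# Keys that may identify a named place; this list is an assumption.
BUSINESS_KEYS = ("building", "amenity", "shop")


def pick_business_name(address_dict):
    """Return (key, name) for the first business-like key, or None."""
    for key in BUSINESS_KEYS:
        if key in address_dict:
            return key, address_dict[key]
    return None
```

With the address dict shown above, this returns `None` instead of silently treating the road name as a business.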

shankari commented 5 years ago

Ah, the nominatim configuration also needs to be enhanced, and the sample file has not been updated.

shankari commented 5 years ago

But even after updating the sample file, I get the same result. @valin1, I see that you have added a zoom parameter set to building level, which will actually return a business (e.g. https://nominatim.openstreetmap.org/reverse.php?format=html&lat=37.701995&lon=-122.4706021&zoom=18).

However, I don't see any evidence that you are actually using this parameter. I know you test via an ipython notebook - can you post an example here of the return_address_from_location_nominatim function actually returning a business name?

shankari commented 5 years ago

OK, but even if we did use nominatim with the correct zoom level, it looks like our custom nominatim server does not return the building name:

In [7]: requests.get("http://54.198.41.236/nominatim/reverse?lat=37.701995&lon=-122.4706
   ...: 021&format=json&zoom=18&addressdetails=1").json()
Out[7]:
{'address': {'country': 'United States of America',
  'country_code': 'us',
  'county': 'San Mateo County',
  'neighbourhood': 'Westlake',
  'postcode': '94014',
  'road': 'Junipero Serra Boulevard',
  'town': 'Daly City'},
 'display_name': 'Junipero Serra Boulevard, Westlake, Daly City, San Mateo County, 94014, United States of America',
 'lat': '37.7008995',
 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright',
 'lon': '-122.4707415',
 'osm_id': '183198747',
 'osm_type': 'way',
 'place_id': '525262'}

Although the classic nominatim server does:

In [6]: requests.get("https://nominatim.openstreetmap.org/reverse?lat=37.701995&lon=-122
   ...: .4706021&format=json&zoom=18&addressdetails=1").json()
Out[6]:
{'address': {'building': 'Genesys',
  'city': 'Daly City',
  'country': 'USA',
  'country_code': 'us',
  'county': 'San Mateo County',
  'neighbourhood': 'Westlake',
  'postcode': '94015',
  'road': 'Junipero Serra Boulevard',
  'state': 'California'},
 'boundingbox': ['37.7009805', '37.7015647', '-122.4705701', '-122.4697247'],
 'display_name': 'Genesys, Junipero Serra Boulevard, Westlake, Daly City, San Mateo County, California, 94015, USA',
 'lat': '37.70127',
 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
 'lon': '-122.470112323229',
 'osm_id': 33967913,
 'osm_type': 'way',
 'place_id': 84517625}
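To make this comparison repeatable across servers, the query can be built in one place. This is a sketch only: `reverse_url` and `has_building` are hypothetical helpers, and the parameters are the ones used in the two requests above.

```python
import urllib.parse


def reverse_url(base, lat, lon, zoom=18):
    """Build a nominatim reverse-geocoding URL with address details enabled."""
    query = urllib.parse.urlencode({
        "lat": lat, "lon": lon,
        "format": "json", "zoom": zoom, "addressdetails": 1,
    })
    return base.rstrip("/") + "/reverse?" + query


def has_building(reverse_json):
    """True if the reverse-geocode result names a building."""
    return "building" in reverse_json.get("address", {})
```

Running both servers through `reverse_url` and `has_building` would show at a glance whether a given deployment returns building names for a test point.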
shankari commented 5 years ago

Argh, I used the wrong issue number while committing the other change.

shankari commented 5 years ago

The real issue for https://github.com/e-mission/e-mission-server/commit/88fef81582d7706d22525d0beefcba4149441f0a is https://github.com/e-mission/e-mission-docs/issues/339 in case anybody follows the trail and gets here.

shankari commented 5 years ago

At any rate, even the enhanced nominatim is not going to give us a list of potential businesses, so we would need to use either yelp, google places or overpass.de. I vote that we go with yelp since we are already using them for the category + related business search and consistency will help accuracy.

Testing out the same location for the yelp API...

In [12]: sugg.request(sugg.API_HOST, sugg.SEARCH_PATH, sugg.YELP_API_KEY, url_params={
    ...: 'latitude': 37.701995, 'longitude': -122.4706021, 'radius': 100, 'limit': 50})
Out[12]:
{'businesses': [{'alias': 'tomo-sushi-and-teriyaki-daly-city',
   'categories': [{'alias': 'japanese', 'title': 'Japanese'},
    {'alias': 'sushi', 'title': 'Sushi Bars'}],
   'coordinates': {'latitude': 37.7020804132731,
    'longitude': -122.470485788361},
   'display_phone': '(650) 991-1045',
   'distance': 13.961129730970354,
   'id': '0K6O21FH30efT_nDB_ALIg',
   'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/F3yzCKOgOdKX6rVxM9OWxA/o.jpg',
   'is_closed': False,
   'location': {'address1': '1901 Junipero Serra Blvd',
    'address2': '',
    'address3': '',
    'city': 'Daly City',
    'country': 'US',
    'display_address': ['1901 Junipero Serra Blvd', 'Daly City, CA 94014'],
    'state': 'CA',
    'zip_code': '94014'},
   'name': 'Tomo Sushi & Teriyaki',
   'phone': '+16509911045',
   'price': '$$',
   'rating': 3.5,
   'review_count': 480,
   'transactions': [],
   'url': 'https://www.yelp.com/biz/tomo-sushi-and-teriyaki-daly-city?adjust_creative=TJdz5Lnp0Vk5NdomBW6G1g&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=TJdz5Lnp0Vk5NdomBW6G1g'},
....

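And more succinctly, printing just the alias and name of each business. Note that this also illustrates why the hack of taking the name and adding dashes, or trying to manually add -berkeley, will break: the real aliases are lowercased, spell out "&" as "and", carry the city as a suffix, and sometimes end in a numeric disambiguator.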

In [15]: for b in yelp_results['businesses']:
    ...:     print(b['alias'], b['name'])
    ...:
tomo-sushi-and-teriyaki-daly-city Tomo Sushi & Teriyaki
cold-stone-creamery-daly-city Cold Stone Creamery
westlake-coffee-shop-daly-city-2 Westlake Coffee Shop
round-table-pizza-daly-city Round Table Pizza
subway-daly-city-13 Subway
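A small sketch contrasting the dash-replacement hack with a lookup built from the aliases the API actually returned. `naive_alias` reproduces the hack from the earlier transcript; `alias_lookup` is a hypothetical helper, and the sample data is copied from the output above.

```python
def naive_alias(name):
    """The hack under discussion: replace spaces with dashes."""
    return name.replace(" ", "-")


def alias_lookup(yelp_results):
    """Map each business name to the alias(es) yelp actually returned."""
    lookup = {}
    for b in yelp_results["businesses"]:
        lookup.setdefault(b["name"], []).append(b["alias"])
    return lookup


sample = {"businesses": [
    {"alias": "tomo-sushi-and-teriyaki-daly-city", "name": "Tomo Sushi & Teriyaki"},
    {"alias": "subway-daly-city-13", "name": "Subway"},
]}
lookup = alias_lookup(sample)
```

Here `naive_alias("Tomo Sushi & Teriyaki")` gives `Tomo-Sushi-&-Teriyaki`, which matches none of the real aliases, so using the returned `alias` field directly is the safer option.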
shankari commented 5 years ago

While waiting for @valin1 to comment, I will attempt to work on adding at least the thumbs up/down buttons.

shankari commented 5 years ago

Some testing of the google APIs just for my own edification. lat/lng are the same as before. It looks like we first look up the address, and then we look up nearby businesses, and then we say that the business is a match if the address matches. But as we can see from the example below, the address can be formatted differently. @valin1 is there a reason you picked this approach here?

In [25]: nm.return_address_from_location_google("37.7019950", "-122.4706021")
DEBUG:root:About to query google with URL https://maps.googleapis.com/maps/api/geocode/json?latlng=37.7019950,-122.4706021&key=<redacted>
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): maps.googleapis.com
DEBUG:requests.packages.urllib3.connectionpool:https://maps.googleapis.com:443 "GET /maps/api/geocode/json?latlng=37.7019950,-122.4706021&key=<redacted> HTTP/1.1" 200 1450
DEBUG:root:Got result = [{'long_name': '1901G', 'short_name': '1901G', 'types': ['street_number']}, {'long_name': 'Junipero Serra Boulevard', 'short_name': 'Junipero Serra Blvd', 'types': ['route']}, {'long_name': 'Daly City', 'short_name': 'Daly City', 'types': ['locality', 'political']}, {'long_name': 'San Mateo County', 'short_name': 'San Mateo County', 'types': ['administrative_area_level_2', 'political']}, {'long_name': 'California', 'short_name': 'CA', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'United States', 'short_name': 'US', 'types': ['country', 'political']}, {'long_name': '94014', 'short_name': '94014', 'types': ['postal_code']}]
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): maps.googleapis.com
DEBUG:requests.packages.urllib3.connectionpool:https://maps.googleapis.com:443 "GET /maps/api/place/nearbysearch/json?location=37.7019950,-122.4706021&radius=10&key=<redacted> HTTP/1.1" 200 1227
DEBUG:root:For amenity Daly City, comparing address 1901G Junipero Serra Boulevard, San Mateo County with nearby business Daly City
DEBUG:root:For amenity Tomo Sushi & Teriyaki, comparing address 1901G Junipero Serra Boulevard, San Mateo County with nearby business 1901 Junipero Serra Boulevard # G, Daly City
DEBUG:root:After checking = 1901G Junipero Serra Boulevard, San Mateo County, got business tuple (False, '')
Out[25]: ('1901G Junipero Serra Blvd, San Mateo County', True)
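If address matching is kept, the comparison needs to tolerate formatting differences such as `1901G` versus `1901 ... # G`. A rough sketch of one way to do that, comparing token sets of the street part only. The helper names are hypothetical, and this does not handle abbreviations such as `Blvd` versus `Boulevard`.

```python
import re


def street_tokens(address):
    """Tokenize the street part of an address, splitting '1901G' into '1901', 'g'."""
    street = address.split(",")[0].lower()
    return frozenset(re.findall(r"\d+|[a-z]+", street))


def same_street_address(a, b):
    """Compare two addresses by street tokens, ignoring order and punctuation."""
    return street_tokens(a) == street_tokens(b)
```

With this, the two Tomo Sushi strings from the log above compare equal, while the address and the bare `Daly City` entry do not.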
shankari commented 5 years ago

There appears to be a serious issue with the Mountain View Kohl's. Even if I increase the radius to 300m, the Kohl's doesn't show up:

In [27]: yelp_results = sugg.request(sugg.API_HOST, sugg.SEARCH_PATH, sugg.YELP_API_KEY,
    ...:
    ...: url_params={'latitude':  37.4035500, 'longitude': -122.1078300,
    ...: 'radius': 300, 'limit': 50})

In [28]: for b in yelp_results['businesses']:
    ...:     print(b['alias'], b['name'], b['distance'])
    ...:
hunan-homes-express-mountain-view Hunan Home's Express 164.1491100942954
sushi-88-and-ramen-mountain-view Sushi 88 & Ramen 120.02196802266633
chilly-and-munch-mountain-view Chilly & Munch 370.20857538011705
pearl-cafe-mountain-view Pearl Cafe 122.52780815547474
momo-grill-truck-sunnyvale-5 Momo Grill Truck 210.62506667833333
mamacitas-tacos-mountain-view Mamacitas Tacos 304.1172989607118
luu-noodle-mountain-view Luu Noodle 85.33709297359775
super-tacos-el-conrro-mountain-view-5 Super Tacos El Conrro 104.89493301135538
tacos-la-oaxaqueña-mountain-view-2 Tacos La Oaxaqueña 222.89717771462585
mcdonalds-mountain-view-8 McDonald's 343.210373246048
lobby-lounge-mountain-view Lobby Lounge 244.86705428219472

Locations well beyond the boundaries of Kohl's do show up, though:

[Screenshots, 2019-04-19: mcdonalds-mountain-view-8 and chilly-and-munch-mountain-view]
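The listing above includes entries at roughly 304 m, 343 m and 370 m even though the requested radius was 300, so the API appears to treat the radius as approximate. A sketch of enforcing the radius client-side using the `distance` field in each result; `within_radius` is a hypothetical helper and the sample values are copied from the output.

```python
def within_radius(yelp_results, radius_m):
    """Keep only businesses whose reported distance is within radius_m metres."""
    return [b for b in yelp_results["businesses"] if b["distance"] <= radius_m]


sample = {"businesses": [
    {"alias": "luu-noodle-mountain-view", "distance": 85.33709297359775},
    {"alias": "chilly-and-munch-mountain-view", "distance": 370.20857538011705},
    {"alias": "mcdonalds-mountain-view-8", "distance": 343.210373246048},
]}
nearby = within_radius(sample, 300)
```

This trims far-away results but does nothing for businesses that are missing from the response altogether, which is the Kohl's problem.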
shankari commented 5 years ago

This may just be due to the type of business. Similarly, I cannot see Target:

[Screenshot, 2019-04-19]
In [33]: yelp_results = sugg.request(sugg.API_HOST, sugg.SEARCH_PATH, sugg.YELP_API_KEY,
    ...:
    ...: url_params={'latitude':  37.3733609, 'longitude': -122.0324436,
    ...: 'radius': 300, 'limit': 50})

In [34]: for b in yelp_results['businesses']:
    ...:     print(b['alias'], b['name'], b['distance'])
    ...:
dishdash-sunnyvale DishDash 374.7813325293151
nom-burger-sunnyvale Nom Burger 346.86439589436856
ramen-seas-sunnyvale Ramen Seas 393.6565136942681
sugar-mama-desserts-sunnyvale-2 Sugar Mama Desserts 364.59503735759137
phoever-sunnyvale PhoEver 368.6893108855658
the-oxford-sunnyvale The Oxford 358.14422488144976
vino-vino-sunnyvale-3 Vino Vino 349.85403973817836
bean-scene-sunnyvale Bean Scene 381.5131123242474
robertos-cantina-sunnyvale-2 Roberto's Cantina 415.79290275897154
rokko-fine-japanese-cuisine-sunnyvale-2 Rokko Fine Japanese Cuisine 355.28298359979146
bambu-sunnyvale-2 BAMBU 343.27907224409205
inchins-bamboo-garden-sunnyvale-3 Inchin's Bamboo Garden 343.683756211071
lilly-macs-irish-bar-and-restaurant-sunnyvale Lilly Mac's Irish Bar And Restaurant 371.72816899734283
tao-tao-cafe-sunnyvale Tao Tao Cafe 388.15660839915495
gumbas-italian-restaurant-sunnyvale-2 Gumba's Italian Restaurant 401.40407699993585
fibbar-magees-sunnyvale-4 Fibbar Magees 427.77480202154396
tapt-beer-and-kitchen-sunnyvale TAP'T Beer & Kitchen 361.6204640079871
poki-bowl-sunnyvale-2 Poki Bowl 350.76391263236434
siam-taste-sunnyvale Siam Taste 407.6671234669591
the-prolific-oven-sunnyvale-3 The Prolific Oven 350.1750007482006
fashion-wok-sunnyvale Fashion Wok 413.1709988168766
starbucks-sunnyvale-30 Starbucks 346.04166874111087
starbucks-sunnyvale-33 Starbucks 99.76120492260742
oaxacan-kitchen-pop-up-sunnyvale Oaxacan Kitchen Pop-Up 365.4068373708529
cafe-at-nokia-sunnyvale Cafe at Nokia 342.9279683035334
pizza-hut-sunnyvale-3 Pizza Hut 41.8329055628198
shankari commented 5 years ago

All the categories appear to be restaurants. I think we may need to expand the search query. But the dataset will tell us. @b-i-l-l-c-a-o please make sure to include a range of business types.

In [35]: for b in yelp_results['businesses']:
    ...:     print(b['alias'], b['name'], b['distance'], [c['alias'] for c in b['categor
    ...: ies']])
    ...:
dishdash-sunnyvale DishDash 374.7813325293151 ['mideastern', 'mediterranean']
nom-burger-sunnyvale Nom Burger 346.86439589436856 ['newamerican', 'burgers']
ramen-seas-sunnyvale Ramen Seas 393.6565136942681 ['ramen']
sugar-mama-desserts-sunnyvale-2 Sugar Mama Desserts 364.59503735759137 ['icecream', 'catering']
phoever-sunnyvale PhoEver 368.6893108855658 ['vietnamese', 'asianfusion', 'sandwiches']
the-oxford-sunnyvale The Oxford 358.14422488144976 ['gastropubs', 'newamerican', 'british']
vino-vino-sunnyvale-3 Vino Vino 349.85403973817836 ['wine_bars', 'newamerican']
bean-scene-sunnyvale Bean Scene 381.5131123242474 ['coffee']
robertos-cantina-sunnyvale-2 Roberto's Cantina 415.79290275897154 ['mexican', 'bars']
rokko-fine-japanese-cuisine-sunnyvale-2 Rokko Fine Japanese Cuisine 355.28298359979146 ['japanese', 'sushi', 'seafood']
bambu-sunnyvale-2 BAMBU 343.27907224409205 ['coffee', 'juicebars', 'bubbletea']
inchins-bamboo-garden-sunnyvale-3 Inchin's Bamboo Garden 343.683756211071 ['asianfusion', 'chinese', 'indpak']
lilly-macs-irish-bar-and-restaurant-sunnyvale Lilly Mac's Irish Bar And Restaurant 371.72816899734283 ['pubs', 'irish']
tao-tao-cafe-sunnyvale Tao Tao Cafe 388.15660839915495 ['chinese']
gumbas-italian-restaurant-sunnyvale-2 Gumba's Italian Restaurant 401.40407699993585 ['pizza', 'italian', 'seafood']
fibbar-magees-sunnyvale-4 Fibbar Magees 427.77480202154396 ['irish_pubs', 'sportsbars']
tapt-beer-and-kitchen-sunnyvale TAP'T Beer & Kitchen 361.6204640079871 ['newamerican', 'beerbar', 'breakfast_brunch']
poki-bowl-sunnyvale-2 Poki Bowl 350.76391263236434 ['asianfusion', 'seafood']
siam-taste-sunnyvale Siam Taste 407.6671234669591 ['thai']
the-prolific-oven-sunnyvale-3 The Prolific Oven 350.1750007482006 ['bakeries', 'cafes', 'breakfast_brunch']
fashion-wok-sunnyvale Fashion Wok 413.1709988168766 ['asianfusion', 'hotpot', 'chinese']
starbucks-sunnyvale-30 Starbucks 346.04166874111087 ['coffee']
starbucks-sunnyvale-33 Starbucks 99.76120492260742 ['coffee']
oaxacan-kitchen-pop-up-sunnyvale Oaxacan Kitchen Pop-Up 365.4068373708529 ['mexican', 'foodstands']
cafe-at-nokia-sunnyvale Cafe at Nokia 342.9279683035334 ['restaurants']
pizza-hut-sunnyvale-3 Pizza Hut 41.8329055628198 ['hotdogs']
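To check the "everything is food" impression on a larger dataset, the category aliases can be tallied. A sketch with a hypothetical `category_counts` helper and sample data taken from the output above.

```python
from collections import Counter


def category_counts(yelp_results):
    """Count how often each category alias appears across all businesses."""
    return Counter(
        c["alias"]
        for b in yelp_results["businesses"]
        for c in b["categories"]
    )


sample = {"businesses": [
    {"name": "Bean Scene", "categories": [{"alias": "coffee"}]},
    {"name": "Starbucks", "categories": [{"alias": "coffee"}]},
    {"name": "Ramen Seas", "categories": [{"alias": "ramen"}]},
]}
counts = category_counts(sample)
```

The linked business search documentation also lists `term` and `categories` parameters that could be used to widen the query; whether that surfaces retail stores such as Target is untested here.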
shankari commented 5 years ago

Yeah, I'm pretty sure that's what's going on. Even if I use the exact lat/lng for "NK Trends" (https://www.openstreetmap.org/#map=18/37.37630/-122.03119&layers=D), it doesn't show up in the results:

In [38]: yelp_results = sugg.request(sugg.API_HOST, sugg.SEARCH_PATH, sugg.YELP_API_KEY,
    ...:
    ...: url_params={'latitude':  37.3762979, 'longitude': -122.0311916,
    ...: 'radius': 100, 'limit': 50})

In [39]: for b in yelp_results['businesses']:
    ...:     print(b['alias'], b['name'], b['distance'], [c['alias'] for c in b['categor
    ...: ies']])
    ...:
ramen-seas-sunnyvale Ramen Seas 63.61210457074628 ['ramen']
nom-burger-sunnyvale Nom Burger 63.68859925925443 ['newamerican', 'burgers']
philz-coffee-sunnyvale-3 Philz Coffee 116.44702105351041 ['coffee']
the-oxford-sunnyvale The Oxford 51.872312029747576 ['gastropubs', 'newamerican', 'british']
rokko-fine-japanese-cuisine-sunnyvale-2 Rokko Fine Japanese Cuisine 15.9920412818948 ['japanese', 'sushi', 'seafood']
inchins-bamboo-garden-sunnyvale-3 Inchin's Bamboo Garden 10.630814521030361 ['asianfusion', 'chinese', 'indpak']
bambu-sunnyvale-2 BAMBU 12.064624616834944 ['coffee', 'juicebars', 'bubbletea']
metro-city-restaurant-and-bar-sunnyvale Metro City Restaurant & Bar 93.75000721237014 ['tradamerican', 'breakfast_brunch', 'bars']
vino-vino-sunnyvale-3 Vino Vino 67.34484951206109 ['wine_bars', 'newamerican']
poki-bowl-sunnyvale-2 Poki Bowl 30.847324269162886 ['asianfusion', 'seafood']
lilly-macs-irish-bar-and-restaurant-sunnyvale Lilly Mac's Irish Bar And Restaurant 50.13365082676116 ['pubs', 'irish']
k-tea-cafe-sunnyvale K Tea Cafe 106.68796686001566 ['bubbletea', 'cafes', 'creperies']
tao-tao-cafe-sunnyvale Tao Tao Cafe 71.66573085116174 ['chinese']
sajj-mediterranean-sunnyvale Sajj Mediterranean 122.79608688742314 ['falafel']
taverna-bistro-sunnyvale Taverna Bistro 122.16745104183407 ['mediterranean', 'turkish', 'hookah_bars']
river-rock-taproom-sunnyvale River Rock Taproom 86.9823860742604 ['beerbar', 'tradamerican']
fashion-wok-sunnyvale Fashion Wok 87.34403650115478 ['asianfusion', 'hotpot', 'chinese']
meyhouse-sunnyvale Meyhouse 122.16745104183407 ['turkish', 'mediterranean']
starbucks-sunnyvale-30 Starbucks 40.888130722202504 ['coffee']
the-dons-deli-sunnyvale The Don's Deli 124.5771902664676 ['delis']
murphys-law-sunnyvale Murphy's Law 119.08098172740476 ['irish_pubs', 'sportsbars']
shankari commented 5 years ago

After merging all the PRs to create the dataset, the category results are:

DEBUG:root:After instance O'hair Park, successfulTests = 2, failedTests = 48
INFO:root:Test complete, overall accuracy = 4.0

The two successful tests are residential locations added by @b-i-l-l-c-a-o (I didn't see any residential locations from the others). They are genuinely not in yelp (unlike high school!), so no matching categories are found for them.

DEBUG:root:After instance T Pumps, successfulTests = 0, failedTests = 44
DEBUG:root:-----Martinez Commons------
...
DEBUG:root:After instance Martinez Commons, successfulTests = 1, failedTests = 44
DEBUG:root:-----Richmond District House------
...
DEBUG:root:After instance Richmond District House, successfulTests = 2, failedTests = 44
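For reference, the reported accuracy is consistent with a simple percentage of successful tests over all tests. This is an assumption about how the test harness computes it, inferred only from the numbers in the log (2 successes, 48 failures, 4.0); `accuracy_percent` is a hypothetical helper.

```python
def accuracy_percent(successful, failed):
    """Percentage of successful tests; 0.0 when no tests ran."""
    total = successful + failed
    return 100.0 * successful / total if total else 0.0
```

Under that assumption, `accuracy_percent(2, 48)` reproduces the 4.0 in the log.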