missinglink closed this issue 8 years ago
Additionally there are issues with addresses in autocomplete:
@randymeech reported:
I started the in eastern Long Island (on the move), searched for my address. Still took too long to find with many, many far-flung results in autocomplete given that I was 100 miles away from Brooklyn.
@souperneon also reported:
I have the same problem with search every time. The address is there and I can find it if I am within (I'm guesstimating) 10-15 miles of it. But I can't find it even if I type in the full exact address if I am further than that.
Thanks for adding this one in, @dianashk. I'm not sure if this is related, but autocomplete stops autocompleting after a point and just shows results from Africa?
@souperneon Could you send an example?
EXAMPLE: 55 Stratford Av Greenlawn, New York (Focus point from Queens, NY)
55 Stratford Av
1) 55 Stratford Avenue, Huntington, NY :white_check_mark:
2) 5 Stratford Road, Brooklyn, NY
3) 5 Stratford Road, Brooklyn, NY
4) 2 Stratford Road, Brooklyn, NY
5) 2 Stratford Road, Brooklyn, NY
6) 7 Stratford Avenue, Staten Island, NY
7) 7 Stratford Avenue, Staten Island, NY
8) 6 Stratford Avenue, Staten Island, NY
9) 6 Stratford Avenue, Staten Island, NY
10) 4 Stratford Avenue, Staten Island, NY
{
"type": "Feature",
"properties": {
"id": "de032ef29a8841beaa6cbf0a82a95cc1",
"gid": "oa:address:de032ef29a8841beaa6cbf0a82a95cc1",
"layer": "address",
"source": "oa",
"name": "55 Stratford Avenue",
"housenumber": "55",
"street": "Stratford Avenue",
"postalcode": "11740",
"country_a": "USA",
"country": "United States",
"region": "New York",
"region_a": "NY",
"county": "Suffolk County",
"localadmin": "Huntington",
"locality": "Greenlawn",
"neighbourhood": "Little Plains",
"confidence": 0.882,
"distance": 49.695,
"label": "55 Stratford Avenue, Huntington, NY"
},
"geometry": {
"type": "Point",
"coordinates": [
-73.34405,
40.868223
]
}
},
55 Stratford Av Greenlawn
1) 55 Greenlawn Terrace, Babylon, NY
2) Greenlawn Station, Huntington, NY
3) 5 Stratford Road, Brooklyn, NY
4) 5 Stratford Road, Brooklyn, NY
5) 2 Stratford Road, Brooklyn, NY
6) 2 Stratford Road, Brooklyn, NY
7) 7 Stratford Avenue, Staten Island, NY
8) 7 Stratford Avenue, Staten Island, NY
9) 6 Stratford Avenue, Staten Island, NY
10) 6 Stratford Avenue, Staten Island, NY
55 Stratford Av Greenlawn NY
1) 5 Stratford Road, Brooklyn, NY
2) 5 Stratford Road, Brooklyn, NY
3) 2 Stratford Road, Brooklyn, NY
4) 2 Stratford Road, Brooklyn, NY
5) 55 Greenlawn Terrace, Babylon, NY
6) Greenlawn Station, Huntington, NY
7) 7 Stratford Avenue, Staten Island, NY
8) 7 Stratford Avenue, Staten Island, NY
9) 6 Stratford Avenue, Staten Island, NY
10) 6 Stratford Avenue, Staten Island, NY
55 Stratford Av Greenlawn New York
1) New York County, NY
2) Greater New York Academy, Queens, NY
3) Greater New York Academy, Queens, NY
4) Gracie Station New York Post Office, Manhattan, NY
5) Gracie Station New York Post Office, Manhattan, NY
6) New York Structural Biology Center, Manhattan, NY
7) New York Structural Biology Center, Manhattan, NY
8) Hamilton Grange Station New York Post Office, Manhattan, NY
9) Hamilton Grange Station New York Post Office, Manhattan, NY
10) Grand Hyatt New York, Manhattan, NY
Appears to be a few things in play:
huntington
into account either)
Search is successfully completing this query.
pelias/pelias#45 has some ideas of how to handle this, but generally we need a strong definition of what the autocomplete balance should look like.
Closing pelias/pelias#45 but look there for additional notes.
Going to take a stab at defining what we're looking for here, before moving it up and into the wiki:
There are a few behaviors we're looking to model:
We can't talk about one without the other. And it's super important to talk about this in the context of the deduplication work [#339], which should limit cruft in the results.
TKTK
This is something we'll continue to argue about and should come from a further conversation w/ our users. It's also challenging to define given focus bias points.
Generally, we should be aware of some of the content of the queries as they're coming in, allowing the detection of a leading numeric (likely then an address) or letters (likely then a locality, a POI, a region, or a street).
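A rough sketch of the leading-token check described above. The helper name `guess_query_kind` and the regular expression are assumptions made up for illustration; this is not how Pelias parses input, just one way the idea could look.

```python
import re

# Hypothetical pattern: digits, optionally followed by ONE letter ("55", "55a"),
# ending at a word boundary so that ordinals like "5th" are not treated as
# house numbers.
LEADING_NUMBER = re.compile(r"^\s*\d+[a-zA-Z]?\b")


def guess_query_kind(text: str) -> str:
    """Guess what a partial autocomplete query is likely to refer to."""
    if not text.strip():
        return "empty"
    if LEADING_NUMBER.match(text):
        # Leading numeric: likely an address.
        return "address"
    # Leading letters: could be a locality, a POI, a region, or a street.
    return "name"
```

For the queries in this thread, `guess_query_kind("55 Stratford Av")` lands in the address branch and `guess_query_kind("katzs deli")` in the name branch. A real implementation would need to handle many more cases (unit numbers, non-Latin scripts, numbered streets).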
We know that users will often continue to type their query, even if it's in the result list already. We all do this. Perhaps it's to drive the matching place higher in the results, or because the user hasn't realized that their place is there (or hasn't yet processed it visually), or because they're conditioned to keep typing and then hit "Search". They keep typing. Perhaps this is part of why FSTs work well for autocomplete (when they can be used).
Examples:
When a user begins typing, their intended place should (eventually) show up in the top 5 places of the result list. Once it does show up, as the user specifies further, the place should continue to match, and its ranking should improve, eventually bringing it toward the top of the results (unless there are other, equally scoring places).
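One way to turn this expectation into something testable is sketched below. The function names and the `top_n` default are assumptions for illustration, not an existing Pelias test helper: given the rank of the intended place after each successive keystroke (or `None` when it is absent), check that it never slips once it has entered the top 5.

```python
from typing import Optional, Sequence


def rank_of(label: str, results: Sequence[str]) -> Optional[int]:
    """1-based position of label in results, or None if it is absent."""
    for position, candidate in enumerate(results, start=1):
        if candidate == label:
            return position
    return None


def ranks_never_worsen(ranks: Sequence[Optional[int]], top_n: int = 5) -> bool:
    """True if the place, once inside the top N, never drops out or moves down."""
    best = None
    for rank in ranks:
        if best is None:
            # Still waiting for the place to enter the top N.
            if rank is not None and rank <= top_n:
                best = rank
            continue
        if rank is None or rank > best:
            return False
        best = rank
    return True
```

Applied to the Stratford example in this thread, the intended address is first for "55 Stratford Av" and then missing from the listed results for the three longer queries, i.e. `ranks_never_worsen([1, None, None, None])`, which fails the check. Note that this only checks for regressions; it does not assert that the place eventually appears.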
QUESTION: How should focus.point affect this? At what point should a linguistic match overwhelm closeness? And how should distance from the point affect ranking?
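To make the question concrete, here is a toy model of one possible answer. Everything in it is an assumption for discussion (the exponential decay, `scale_km`, `geo_weight`, the function names); it is not the scoring Pelias or Elasticsearch actually applies. The idea is that distance can only discount a text score down to a floor, so a strong linguistic match is never fully overwhelmed by closeness.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def blended_score(text_score: float, distance_km: float,
                  scale_km: float = 50.0, geo_weight: float = 0.5) -> float:
    """Hypothetical blend: distance discounts the text score, but never
    below (1 - geo_weight) of its original value."""
    decay = math.exp(-distance_km / scale_km)
    return text_score * ((1.0 - geo_weight) + geo_weight * decay)
```

With these made-up defaults, a full text match 50 km from the focus point still outscores a 0.6 text match sitting right on it. The distance from the Queens focus point used later in this thread (40.769073, -73.918458) to the Greenlawn address comes out at roughly 50 km, in line with the `distance` field in the JSON above. Where the floor and the decay scale should sit is exactly the open question.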
Our expansion / compression logic will map street -> st, meaning the partial tokens str -> str and stre -> stre are left unchanged while the user is still typing, and the additional scoring boost seen by the user before won't be expressed.
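A tiny illustration of the effect, as I read the comment above. The table and function are hypothetical stand-ins, not the actual Pelias analysis chain: only whole tokens are compressed, so a half-typed "street" never becomes "st".

```python
# Hypothetical compression table; only street -> st comes from this thread.
COMPRESSIONS = {"street": "st"}


def compress_tokens(text: str) -> str:
    """Lowercase, split on whitespace, and compress whole tokens only."""
    tokens = text.lower().split()
    return " ".join(COMPRESSIONS.get(token, token) for token in tokens)
```

Typing "stratford st" and "stratford street" both normalise to "stratford st", but the keystrokes in between ("stratford str", "stratford stre") pass through untouched, so the query briefly stops looking like the indexed form.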
for the sake of brevity, if there are more issues regarding autocomplete (i'm sure there are) could you please file a separate ticket and add the orange autocomplete label: https://github.com/pelias/api/labels/autocomplete
thanks! this will make it easier to read everyone's feedback, categorise & prioritise better, and start fixing it!
Update on this issue:
/v1/autocomplete?focus.point.lat=40.74686681162143&focus.point.lon=-73.98983001708986&text=katzs deli
1) Katz's Delicatessen, Manhattan, NY, USA
2) Katz Deli, Fort Loudon, TN, USA
3) Katz Deli, Montréal, Canada
4) Katz's Deli Express, Shenandoah, TX, USA
5) Katz's New York Deli, Houston, TX, USA
^ this seems to be a massive improvement
leaving this open for now as I'm pretty confident all these issues are already being addressed in other tickets, I will use these cases for the regression test suite once the work is complete.
Moving ticket to 'in review'
all the issues noted above have been resolved in the production environment, at time of writing:
/v1/autocomplete?focus.point.lat=40.74686681162143&focus.point.lon=-73.98983001708986&text=katzs deli
1) Katz's Delicatessen, Manhattan, New York, NY, USA
2) Katz Deli, TN, USA
3) Katz Deli, Montréal, Quebec, Canada
4) Katz's Deli Express, Shenandoah, TX, USA
5) Katz's New York Deli, Houston, TX, USA
/v1/autocomplete?focus.point.lat=40.769073&focus.point.lon=-73.918458&text=55 stratford av greenlawn new york
1) 55 Stratford Avenue, Greenlawn, NY, USA
2) 55 Stratford Avenue, Pittsfield, MA, USA
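For anyone re-running these checks, a small sketch of building the request URLs with the query text properly encoded. The helper name and the base URL are placeholders (assumptions); only the /v1/autocomplete path and the text, focus.point.lat and focus.point.lon parameters come from the requests shown in this thread.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def autocomplete_url(base_url: str, text: str,
                     focus_lat: Optional[float] = None,
                     focus_lon: Optional[float] = None) -> str:
    """Build an autocomplete request URL, percent-encoding the query text."""
    params = {}
    if focus_lat is not None and focus_lon is not None:
        params["focus.point.lat"] = focus_lat
        params["focus.point.lon"] = focus_lon
    params["text"] = text
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{base_url}/v1/autocomplete?{urlencode(params, quote_via=quote)}"
```

For example, `autocomplete_url("https://example.invalid", "55 stratford av greenlawn new york", 40.769073, -73.918458)` reproduces the second request above against a placeholder host.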
the "katz diner" issue is covered by an existing acceptance test and the "stratford av" functionality is covered in other existing tests in autocomplete_streets.json
@missinglink \o/
this is a good example for tweaking the balance between linguistics vs. geography vs. exact matches for focus.point:
/v1/autocomplete?focus.point.lat=40.74686681162143&focus.point.lon=-73.98983001708986&text=katzs deli
the query is for the famous http://katzsdelicatessen.com/ and the focus is on NYC
for this query /autocomplete returns the following for "katzs deli":
.. but when fully specifying the name as "katzs delicatessen" then we get the correct item first:
it would be a good idea to play with this linguistics/geography balance and the boost applied to exact matches in order to try and get this better, at the same time trying to avoid breaking other behaviour...
the admin boosting is also playing a part here, maybe we can discuss in a project meeting because there are lots of different analyses playing a part in what gets returned here.