pelias / labels

Pelias Label generation
https://pelias.io
MIT License

Idea: add more detail to labels based on other results #8

Open orangejulius opened 7 years ago

orangejulius commented 7 years ago

Right now, each result has its label determined only by the data in that record itself. This is easy, consistent, and simple. But sometimes I wonder if it makes sense to add more detail if a lot of results have labels that look similar, so that the results can be distinguished.

Consider this example from Mapillary that came from @eneerhut (thanks!): [screenshot from 2016-10-20 14-32-32]

It looks at first like we are just bad at deduplicating, but in fact there are that many Zacatecas in the world: [screenshot from 2016-10-20 14-30-29]

Even worse, some are regions, some are localities, some are even neighbourhoods, but you can't tell from the labels. Obviously, there's some leeway in the UI of things like Mapzen.js to add a little bit more information to what's displayed (showing the layer might help here, for example), but perhaps our labels could help.

This could also be useful for cases like in https://github.com/pelias/pelias/issues/317, where searches for common restaurant chains give back lots of identical results.

eneerhut commented 7 years ago

Thanks for the explanation @orangejulius. As you say, it's really a UI thing, and displaying the admin level would make search a lot more intuitive.

orangejulius commented 7 years ago

UI changes could definitely help (I know I've seen a geocoding results dropdown somewhere that had a very nice, subtle, right-justified section showing what type of result it was, but I can't find it now), but we can and will continue to improve labels too :)

hannesj commented 7 years ago

Ping @vesameskanen, any ideas from https://github.com/HSLdevcom/pelias-api/pull/26 or https://github.com/HSLdevcom/pelias-api/pull/30?

vesameskanen commented 7 years ago

Yes, indeed :) The problem becomes very obvious when you search for a generic venue name such as 'k-market, Helsinki' (a shop chain).

Briefly, we implemented a new postprocessing layer on top of the label generator. It iterates over a set of configured 'name expanders' until all labels have become unique or all expanders have been executed. The iteration handles each group of identical names separately, so that only the minimal amount of expansion takes place. Looooong names are not especially desirable.

We currently apply 4 expanders, in this order:

  1. Add street address to venue name
  2. Add configured admin levels to name, if not already included (e.g. neighbourhood)
  3. Add geographic qualification like 'north' or 'southwest'
  4. Add category information

We also implemented a 5th expander which adds the layer type (localadmin, region etc.), but that is currently unused as our UI shows the layer type using a graphical icon.

Expansions 1 and 2 work especially nicely for the problematic 'k-market' search. Name + street address is a very natural way to identify the items.
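For anyone who wants to experiment with the idea, here is a rough Python sketch of the expander loop described above. It is not the code from the linked pull requests: the result fields (`layer`, `street`, `neighbourhood`), the function names, and the sample data are all made up for illustration, only the first two expanders are sketched, and the choice to append the extra detail at the end of the label is just a simplification.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

# A result is a plain dict here. The field names are hypothetical and are
# not taken from the Pelias or HSL schema.
Result = Dict[str, str]
# An expander returns a longer label, or None if it has nothing to add.
Expander = Callable[[str, Result], Optional[str]]


def add_street_address(label: str, result: Result) -> Optional[str]:
    """Sketch of expander 1: append the street address to a venue name."""
    if result.get("layer") == "venue" and result.get("street"):
        return f"{label}, {result['street']}"
    return None


def add_neighbourhood(label: str, result: Result) -> Optional[str]:
    """Sketch of expander 2: append an admin level if not already included."""
    hood = result.get("neighbourhood")
    if hood and hood not in label:
        return f"{label}, {hood}"
    return None


def expand_labels(
    results: List[Result], labels: List[str], expanders: List[Expander]
) -> List[str]:
    """Run expanders in order until labels are unique or expanders run out."""
    labels = list(labels)  # work on a copy
    for expander in expanders:
        # Group result indexes by their current label.
        groups = defaultdict(list)
        for i, label in enumerate(labels):
            groups[label].append(i)
        duplicates = [idx for idx in groups.values() if len(idx) > 1]
        if not duplicates:
            break  # everything is unique, stop expanding
        # Only touch labels that collide, so unique labels stay short.
        for indexes in duplicates:
            for i in indexes:
                expanded = expander(labels[i], results[i])
                if expanded is not None:
                    labels[i] = expanded
    return labels


# Made-up sample data: two shops of the same chain plus one locality.
results = [
    {"layer": "venue", "street": "Example Street 1", "neighbourhood": "A"},
    {"layer": "venue", "street": "Sample Road 2", "neighbourhood": "B"},
    {"layer": "locality"},
]
labels = ["K-Market", "K-Market", "Helsinki"]
expanded = expand_labels(results, labels, [add_street_address, add_neighbourhood])
print(expanded)
```

With this sample data the street address alone is enough to tell the two shops apart, so the neighbourhood expander never runs and the unique 'Helsinki' label is left untouched. If both shops were on the same street, the loop would fall through to the second expander.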