gravitystorm / openstreetmap-carto

A general-purpose OpenStreetMap mapnik style, in CartoCSS

Ending shop/office catch-all #5014

Open imagico opened 2 months ago

imagico commented 2 months ago

In light of the work of @matkoniecz on cleaning up non-standard shop values, I think we should reconsider the catch-all we have for rendering generic dot symbols for shop=* and office=*.

Background

For background: We introduced the catch-all rendering in #2415, and it was already a controversial decision back then. Since then we had an attempt in #3730 to remove the catch-all in favor of a script-generated list of common values - but that was not merged. In #3718 we removed support for shop=yes - leading to the inconsistency documented in #4906. The office catch-all was introduced in #3163. In both cases any and all shop=* and office=* values other than those in a short exclusion list (like 'no') are shown with a generic dot and a label, just like any well-established value that is not explicitly rendered with a dedicated symbol.
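To make the difference concrete, here is a minimal Python sketch of the current catch-all next to an allow-list. It is only an illustration: the style does this in its SQL and CartoCSS layers, not in Python, and the function names and the contents of `EXCLUDED`, `DEDICATED` and `accepted` are assumptions made for the example (the thread only names 'no' as an excluded value).

```python
# Illustrative only: hypothetical names, not taken from the style itself.
EXCLUDED = {"no"}  # the thread names only 'no'; the real exclusion list is longer
DEDICATED = {"bakery", "supermarket"}  # stand-ins for values with their own symbol


def rendering_catch_all(value):
    """Current behaviour: anything not excluded gets at least a dot."""
    if value is None or value in EXCLUDED:
        return None
    return "symbol" if value in DEDICATED else "dot"


def rendering_allow_list(value, accepted):
    """Proposed behaviour: a dot only for values on an accepted list."""
    if value is None or value in EXCLUDED:
        return None
    if value in DEDICATED:
        return "symbol"
    return "dot" if value in accepted else None


if __name__ == "__main__":
    accepted = {"kimchi", "tortilla"}  # hypothetical accepted values
    for value in ("bakery", "kimchi", "supermarkt", "no"):
        print(value, rendering_catch_all(value), rendering_allow_list(value, accepted))
```

With the catch-all, a typo such as `supermarkt` is still rewarded with a dot and a label; with an allow-list it is not rendered until the value becomes established.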

Reasoning

The reasoning for my suggestion to stop this is that, while the catch-all supports the low-barrier introduction of new values through positive feedback, and in this way supports mapping a diverse geography, it also has the substantial negative effect of providing positive feedback on mis-taggings like typos and the introduction of new synonyms for existing tags - including misuse of shop=* to map non-shop POIs and make them appear with a dot and label.

As OSM grows, the number of established shop=* and office=* values increases (both in numbers and in diversity), while the number of real-world shops for which no fitting classification exists yet, and which therefore require the invention of a new shop type, decreases. This makes the catch-all less useful with every passing year.

The difficulty - and that was already visible in #3730 and #3163 - is, of course, the generation and maintenance of a list of accepted values for the generic dot rendering. We essentially introduced the catch-all as an easy solution to that problem.

Concrete suggestion

My recommendation at this stage is:

Reasoning for the idea to use a fairly extensive list:

Reasoning for the idea of using a database table instead of a shop IN ('foo', ...) or an inline table:

An open question would still be whether we should pre-generate the list from taginfo data for every release or require style users to run the script for that, similar to get-external-data.py. Considering our very irregular release pattern more recently, the latter might make more sense.

matkoniecz commented 2 months ago

Based on my research that you linked (I actually gave a talk about it at SOTM today), I discovered two problems

I planned to (mostly) solve it before asking to merge #3730, but I keep finding more cases where such a change would encourage people to damage data (by retagging to mismatching values). And I keep finding more and more cases like this as I continue my research.

So actually I am still working on this, but it seems nowhere near finished :/

If anyone is interested, I can dig out a list of cases which are not documented and have no known tagging solution, or which look like good tags.

imagico commented 2 months ago

We need to distinguish between the mapping/tagging issues and the rendering questions. I don't think acute developments in tagging practice should have a significant impact on our rendering decisions here. The problem discussed here has existed for many years and we need to look at it with a long-term perspective. The need to introduce new shop values is not going to go away, it will remain a necessity - which is why the proposed solution aims to handle this dynamically and automatically. The realization that it is probably not possible to produce a hand-curated list of supported values without introducing substantial cultural bias is one of the main reasons for me suggesting the approach outlined. I agree that the existence of wiki documentation of a tag is not a good indicator for its suitability to be rendered.

The underlying problem with shops is largely that mappers decided very early on to use a very fine-grained classification in primary tagging. Differences that would otherwise be expressed with secondary tags are, with shops, usually part of the primary classification. For example, every type of shop dedicated to selling a specific kind of region-specific food needs a unique primary tag (like shop=tortilla, but also things like shop=olive_oil, shop=kimchi).

Everyone should keep in mind that this is not about which shops are rendered with a dedicated pictorial symbol, this is about the generic shop dots that are displayed for those shops for which we do not have a dedicated rendering.

dch0ph commented 2 months ago

This sounds an excellent way forward. Realistically, scalable solutions to POIs will need some kind of POI database table. I tried something along these lines in my aborted attempt to tackle #3880.

imagico commented 2 months ago

To be clear: the table needed for the approach proposed here would be a simple single-column table with the shop values that are considered valid. This would be created and filled by a python script that gets the list of shop values from taginfo and cuts it off at a certain threshold. If we used the same table for shops and offices, that would mean an additional column - or we could just have two tables.
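A rough sketch of what such a script could look like, not a finished implementation. Everything specific here is an assumption made for illustration: the function names, the table name `accepted_shop_values`, the threshold, and the sample counts (which are made up, not real taginfo numbers). The fetch function assumes the taginfo `api/4/key/values` endpoint and its JSON layout. SQLite from the Python standard library stands in for the PostgreSQL rendering database so that the example runs on its own.

```python
import json
import sqlite3
import urllib.parse
import urllib.request

# Assumed taginfo endpoint; check the taginfo API documentation before relying on it.
TAGINFO_URL = "https://taginfo.openstreetmap.org/api/4/key/values"


def fetch_value_counts(key, limit=1000):
    """Fetch (value, count) pairs for a key from taginfo, most used first."""
    query = urllib.parse.urlencode({
        "key": key, "sortname": "count", "sortorder": "desc",
        "page": 1, "rp": limit,
    })
    with urllib.request.urlopen(f"{TAGINFO_URL}?{query}") as response:
        payload = json.load(response)
    return [(row["value"], row["count"]) for row in payload["data"]]


def accepted_values(value_counts, min_count):
    """Keep only values used at least min_count times, sorted by name."""
    return sorted(value for value, count in value_counts if count >= min_count)


def fill_table(conn, table, values):
    """(Re)create a single-column table holding the accepted values."""
    # The table name is interpolated into SQL, so only pass trusted constants.
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} (value TEXT PRIMARY KEY)")
    conn.executemany(f"INSERT INTO {table} (value) VALUES (?)",
                     [(value,) for value in values])
    conn.commit()


if __name__ == "__main__":
    # Made-up sample counts so the example runs without network access;
    # a real run would use fetch_value_counts("shop") instead.
    sample = [("kimchi", 120), ("tortilla", 85), ("supermarkt", 3)]
    conn = sqlite3.connect(":memory:")
    fill_table(conn, "accepted_shop_values", accepted_values(sample, min_count=50))
    print(conn.execute(
        "SELECT value FROM accepted_shop_values ORDER BY value").fetchall())
```

The rendering query could then check shop values against this table instead of a hard-coded list. The two-table variant for offices would just be a second call to `fill_table` with a different table name.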