mapnik / mapnik

Mapnik is an open source toolkit for developing mapping applications
http://mapnik.org
GNU Lesser General Public License v2.1
3.69k stars 826 forks source link

Label placement algorithm improvement when centroid is outside polygon #3550

Closed Penegal closed 6 years ago

Penegal commented 8 years ago

Hello, there.

I found what seems to be a misplaced label: https://www.openstreetmap.org/way/448092558

As you can see, this forest polygon has a very unusual shape, and that causes a problem in the rendering of its name: it's not centered on it, even roughly, but on the far left of the polygon. Even if it can't be displayed on the polygon centroid, it should be a lot closer of it than it is currently.

I've been told on the openstreetmap-carto repo that this is due to the Mapnik algorithm for rendering names, which renders them on the far left or right when the polygon centroid isn't in the polygon. I suggest the following algorithm:

  1. find the centroid;
  2. if it is inside the polygon, write label here; if outside, try slightly more on the left, then slightly more on the right, and keep trying while pushing away. It could also try up and down before going farther from the centroid

This way, the label would be the closest possible of the centroid, at the cost of complexifying the algorithm; that would improve rendering in such corner cases, though, and the 2., AFAIK, won't be executed on the vast majority of polygons, only on corner cases.

Regards.

Edit: could be linked to #3539 and #2078, but I don't know Mapnik enough to be sure.

sommerluk commented 8 years ago

Maybe related to #2762 which seems to be not totally fixed. Cross-reference https://github.com/gravitystorm/openstreetmap-carto/issues/1465

HolgerJeromin commented 7 years ago

You don't need a complex polygon for having the label wrong:

https://github.com/gravitystorm/openstreetmap-carto/issues/2511

kocio-pl commented 7 years ago

I suggest the following algorithm:

  • find the centroid;
  • if it is inside the polygon, write label here; if outside, try slightly more on the left, then slightly more on the right, and keep trying while pushing away. It could also try up and down before going farther from the centroid

It might be also good to make sure that we never leave an outside bounding box (if the label is smaller, of course).

kocio-pl commented 7 years ago

I guess this where label position for polygons is defined (but I may be wrong - I'm not a programmer):

https://github.com/mapnik/mapnik/blob/master/src/text/symbolizer_helpers.cpp#L289

kocio-pl commented 7 years ago

This code looks even closer to what we're looking for:

https://github.com/mapnik/mapnik/blob/master/include/mapnik/geom_util.hpp#L524

talaj commented 7 years ago

Some evidently wrong placements were fixed by https://github.com/mapnik/mapnik/pull/3771, but Bois du Four is still misplaced. In this case it's really a weakness of the algorithm itself.

The algorithm is simple, it's based on centroid algorithm: A horizontal line that crosses centroid is constructed across the polygon. Interior is the center of the widest intersection between the horizontal line and the polygon.

Bois du Four in visual tests: https://github.com/mapnik/test-data-visual/blob/df8e7b64eb1877ed7c99641769d558c4b3f06d6b/images/interior-point-5-512-512-1.0-agg-reference.png

interior

kocio-pl commented 7 years ago

It's probably the most common Mapnik-related problem we have in osm-carto, so I'd like to ask some questions how to further improve it:

  1. If the algorithm is weak, is there a chance to replace it with something more efficient?
  2. Could it be tweaked, for example:
    • test also for vertical line and choose:
    • a wider one
    • one with a bigger weighted_sum= a*x + b*y (we prefer horizontal lines, but bigger area might be more interesting)
    • test if the resulting placement is inside the complex polygon (not in a hole for example) - if not try some brute force approach:
    • middle of the bounding box
    • divide the bounding box into matrix like 3*3 and test the middle of such boxes
    • make sure that the whole text box will be inside the polygon (not just a point)
kocio-pl commented 7 years ago

By the way: how could I easily test current Mapnik code with Kosmtik - probably not messing with my existing system?

talaj commented 7 years ago

https://github.com/mapbox/polylabel looks promising.

talaj commented 7 years ago

test also for vertical line

I think that taking into account also vertical line has very good efficiency/(amount of work) ratio. It will also keep high performance.

pnorman commented 7 years ago

https://github.com/mapbox/polylabel looks promising.

Yep, that's PIA. I'd probably use it as a reference implementation and add the necessary code to Mapnik. Adding it as a dependency would not work well for most people.

make sure that the whole text box will be inside the polygon (not just a point)

My recollection is that Mapnik derives label points and what is being used to label it (picture, small text, large text, etc) does not matter for the algorithm. PIA is likely to be an improvement here since it will tend to find areas with more room for labeling.

kocio-pl commented 7 years ago

Another possible improvement - add the score for the locations closer to the middle of the bounding box. This could help with typical shapes, like this triangle:

https://www.openstreetmap.org/relation/2758761#map=10/63.3275/33.7445

pnorman commented 7 years ago

Another possible improvement - add the score for the locations closer to the middle of the bounding box. This could help with typical shapes, like this triangle:

Ideally we'd want any placement algorithm to be invariant under rotations. This means that rotating the polygon and labeling would be the same as labeling the polygon and then rotating the polygon and label.

Anything that relies on the bbox will not have this property.

I also tend to be suspect of algorithms involving the bbox, because the properties of a bbox can change so much with a rotation.

kocio-pl commented 7 years ago

I prefer Perfect Information Algorithm of course, but the performance might be an issue. If both have some merits (probably one is much more accurate, while the other one is much faster), it could be an option to choose the algorithm for different kind of objects (or scale) instead of just interior for all the polygons.

@talaj Do you plan to work on this issue further or it's just the general idea gathering currently?

[UPDATE:] Just to clarify discussion - I guess PIA meant here "Poles of Inaccessibility".

talaj commented 7 years ago

Do you plan to work on this issue further or it's just the general idea gathering currently?

@kocio-pl I would like to work on it, but I'm uncertain which way to go.

If we cannot afford performance regression, there is no way we can use polylabel instead of the current weak but O(N) algorithm.

There is possibility to use it as a new distinct placement method and as a benchmark for tinkering current interior algorithm, as @pnorman suggests. Possible downside is that it would expose implementation specific information to the interface. Polylabel has also the precision parameter. We would have to expose it or calculate it somehow.

kocio-pl commented 7 years ago

For me it looks like you can go both ways. :smile:

Polylabel precision sounds interesting - maybe it would make sense to allow the users to define variable to tweak the speed/accuracy level (something like marker-placement-accuracy)?

kocio-pl commented 7 years ago

Ideally we'd want any placement algorithm to be invariant under rotations. This means that rotating the polygon and labeling would be the same as labeling the polygon and then rotating the polygon and label.

I'm not sure now if that's really ideal solution in this case. Mapnik produces static maps, which will not be rotated, so this is not the main concern. On the other hand most of the scripts I know are horizontal and it makes sense to take it onto account (for vertical scripts algorithm should prefer vertical space), so the biggest possible text box area could be placed inside the polygon.

I think the optimal algorithm could start with a text box dimensions and try to place it inside the polygon as much as it's possible and if it fits, it should find the best place inside.

kocio-pl commented 7 years ago

Another possibility would be to add text box rotation if text box area does not fit.

This algorithm uses random samples to speed up the process (samples could be pseudo-random to make labels predictable) and could be also used as a basic test (with rotation=0):

https://d3plus.org/blog/behind-the-scenes/2014/07/08/largest-rect/

kocio-pl commented 6 years ago

Update: as of 3.0.17 it's still not fixed - #3771 did not work, however there is an ongoing effort with #3811.

kocio-pl commented 6 years ago

3811 is tested and merged now, so this issue can be probably closed.

talaj commented 6 years ago

Thanks @kocio-pl, closing.