qgis / QGIS

QGIS is a free, open source, cross platform (lin/win/mac) geographical information system (GIS)
https://qgis.org
GNU General Public License v2.0

Priority of labeling text works unclear #32339

Open Mickeler opened 4 years ago

Mickeler commented 4 years ago

The priority of label text is unclear. You cannot set up a hierarchy of labeling rules correctly. This is very important when labeling multi-language street names.

Example: I have three labeling rules that must be ranked correctly. Although I change their order of importance, the end result does not change. The lowest-priority rule (blue) replaces the second rule (magenta). I set the priority value to 10 for the green rule, 8 for the magenta rule and 0 for the blue rule, yet the blue rule still wins over the rules with higher priority values. More information on GIS Stack Exchange.

I have uploaded a test dataset with a qlr file to the WeTransfer download service:

https://we.tl/t-MayPesr8am

QGIS version 3.8.0-Zanzibar

Additional context

https://gis.stackexchange.com/questions/339001/how-to-set-priority-of-labeling

kevinstadler commented 3 years ago

Just wanted to give this a +1 and make some concrete suggestions about what sort of documentation on label priorities (and the placement engine generally) is currently missing but would be good to have (based on my experiences/observations using the engine).

(Maybe if @nyalldawson could chip in and informally answer some of the questions here then the wider community could take up the work of getting the information included in the QGIS manual?)

  1. Since 3.12 (or earlier?) there is a choice of two 'Project labeling versions', but I have found no information about what the difference between them is (other than that Version 2 is 'Recommended')? I had to dig into the source code to figure out that Version 2 is in fact still based on PAL, and not a completely new placement engine altogether. Just a quick one sentence summary of the differences between the two versions would be great...

  2. Speaking of PAL, given that the pal-developer mailing list hasn't seen any posts since 2016 and that the Trac bug tracking system linked on the community page on the PAL website is offline, can we assume that the version of PAL included with QGIS is the only currently maintained version of the library (but also that the publications and documentation for the original library do not necessarily describe the behaviour of 'Version 2' of the engine)?

  3. The most widespread explanation of the 'label priority' that I've found is that higher-priority labels will be categorically preferred over lower-priority ones, e.g. here in the QGIS User Guide where it says:

If there are labels from different layers in the same location, the label with the higher priority will be displayed and the others will be left out.

In my experience I've found this not to be true, e.g. if there is a polygon label with a priority of say 6 for a large polygon which covers two smaller sub-polygons which each have a polygon label of priority 5, then the placement engine will choose to display the two lower-priority labels at the expense of leaving out the higher-priority one. This is only true for smallish differences in priority though, if the difference is sufficiently large (e.g. two priority 5 labels vs. one priority 7+ label) then the higher-priority label wins out over the two lower-priority ones. It all seems to boil down to how the 1-10 priority is effectively mapped onto PAL's 'cost' function, in an apparently non-linear way, e.g.: https://github.com/qgis/QGIS/blob/6237ba2ab194070862c969eb34c57dbadb26ef17/src/core/pal/pal.cpp#L320 So the question is: what is the true effect of the relative differences between different labels' priorities in deciding which one gets placed?

  4. There are further non-linearities in the priority scale which would be good to understand more, for example I have noticed that in maps with lots of other labels, labels with a priority <4 are sometimes not placed even if there would be more than enough space for them at their location. Which raises the question of what the meaning/difference of the absolute priority levels along the 0-10 scale are...

  5. Following on from the previous question, when using data-defined label priorities, what really is the allowed range of priority values that can be used? From the following code it looks like negative priority values are impossible, but is it in principle possible to assign priorities (much) greater than 10?

https://github.com/qgis/QGIS/blob/578f32a9b6ed0fe9f3d510a85a1980e7ea8c6d53/src/core/pal/feature.cpp#L1857-L1868

  6. How exactly are labels penalized based on the size of their underlying line/polygon features, and how does this penalty interact with the user-defined label priority? Can it be controlled/turned off? https://github.com/qgis/QGIS/blob/578f32a9b6ed0fe9f3d510a85a1980e7ea8c6d53/src/core/pal/feature.cpp#L1760
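
As a rough illustration of point 3, here is a minimal Python sketch of how a 0-10 priority could turn into a non-linear cost. It assumes (nothing in this thread confirms it) that the engine rescales the priority to a 1-0 range and charges 2^(10 - 10 * rescaled priority) for leaving a label out. The function names are made up for the example and are not QGIS API.

```python
def rescaled_priority(qgis_priority: float) -> float:
    """ASSUMPTION: the 0-10 GUI priority is mapped to a 1-0 range (10 -> 0)."""
    return 1.0 - qgis_priority / 10.0


def inactive_cost(qgis_priority: float) -> float:
    """ASSUMPTION: cost of NOT placing a label, 2 ** (10 - 10 * rescaled)."""
    return 2.0 ** (10.0 - 10.0 * rescaled_priority(qgis_priority))


def cost_of_dropping(priorities) -> float:
    """Total assumed cost of leaving out every label in `priorities`."""
    return sum(inactive_cost(p) for p in priorities)


if __name__ == "__main__":
    # Compare dropping one large-polygon label with dropping two priority-5 labels.
    for high in (6, 7):
        print(
            f"priority {high}: drop high = {cost_of_dropping([high]):.1f}, "
            f"drop two priority-5 = {cost_of_dropping([5, 5]):.1f}"
        )
```

Under these assumed formulas, leaving out one priority-6 label costs about the same as leaving out two priority-5 labels, so nothing favours the higher-priority label, whereas a priority-7 label outweighs the pair. That would be consistent with the behaviour described in point 3, but the sketch ignores everything else the engine may weigh when choosing between placements.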

nyalldawson commented 3 years ago

@kevinstadler

The documentation has been updated to reflect these changes, see eg. https://github.com/qgis/QGIS-Documentation/pull/5571/files

I'd suggest checking https://docs.qgis.org/testing/en/docs/user_manual/working_with_vector/vector_properties.html#setting-the-automated-placement-engine and https://docs.qgis.org/testing/en/docs/user_manual/style_library/label_settings.html to read the updated version

Since 3.12 (or earlier?) there is a choice of two 'Project labeling versions', but I have found no information about what the difference between them is (other than that Version 2 is 'Recommended')? I had to dig into the source code to figure out that Version 2 is in fact still based on PAL, and not a completely new placement engine altogether. Just a quick one sentence summary of the differences between the two versions would be great...

Explained in the updated docs

Speaking of PAL, given that the pal-developer mailing list hasn't seen any posts since 2016 and that the Trac bug tracking system linked on the community page on the PAL website is offline, can we assume that the version of PAL included with QGIS is the only currently maintained version of the library (but also that the publications and documentation for the original library do not necessarily describe the behaviour 'Version 2' of the engine)?

Yes, upstream pal is long dead! And the QGIS version has diverged greatly from the last version of the upstream library. In fact, it's now a totally integral part of QGIS and can no longer even work as a standalone library -- it relies on the QGIS geometry engine and other QGIS-specific API.

kevinstadler commented 3 years ago

Ah that's great, thanks, I didn't see that it was already updated in testing, should've checked really!

Regarding the several-low-priority-labels chosen over one higher-priority label issue I described above I have created a little test project/dataset that exhibits the problem: label-priority-clash-test-project.zip

Below is an export of the layout included in the project which shows the erratic label placement. The colors of the polygons give an indication of their label priority: within each column there is always one (large) background polygon which has the same priority across all rows (3 in the left-most column, 10 in the right-most column). Within each large high-priority polygon there are two smaller polygons with a label priority identical (top row), 1 lower (middle row) or 2 lower (bottom row) than the respective large background polygon/label: exportlayout

  1. The optimal placement would be that for each polygon cluster the highest priority label should be displayed, and then possibly also one of the lower-priority labels (since two labels comfortably fit in the overall area). At some scales this is actually what happens, but at others (such as the one currently selected in the project) the two lower-priority labels are actually displayed at the expense of the higher-priority label (see polygon clusters marked by the red box). Only slightly changing anything that affects the label size and spacing, be it the label size, distance between the polygons or the scale of the map (this is particularly strange because the label size is actually specified in map units...) can 'fix' the display to either show one (centered) high-priority label or both one high- and one low-priority label instead. Based on your knowledge of the underlying placement algorithm I'm curious to know if you've got a hunch why that might be the case and how it could be fixed. There are many obviously better placements for such a simple data set, so I wonder if it's something to do with the initial candidate generation for placements?

  2. Using Version 2 of the placement engine, labels of priority 4 and under are categorically not displayed at any scale. Can you confirm this? My current version:

QGIS version 3.14.0-Pi QGIS code revision 9f7028fd23
Compiled against Qt 5.12.3 Running against Qt 5.12.3
Compiled against GDAL/OGR 2.4.1 Running against GDAL/OGR 2.4.1
Compiled against GEOS 3.7.2-CAPI-1.11.2 Running against GEOS 3.7.2-CAPI-1.11.2 b55d2125
Compiled against SQLite 3.28.0 Running against SQLite 3.28.0
PostgreSQL Client Version 11.3 SpatiaLite Version 4.3.0a
QWT Version 6.1.4 QScintilla2 Version 2.11.1
Compiled against PROJ 5.2.0 Running against PROJ Rel. 5.2.0, September 15th, 2018
OS Version macOS High Sierra (10.13)
Active python plugins db_manager; MetaSearch

kevinstadler commented 3 years ago

I played around some more with the 'Show candidates' option and the sub-optimal placement did indeed boil down to the Number of Candidates default setting.

With Version 2 label placement and the number of polygon candidates increased to 4/sqcm (over the default 2.5), the general label placement did improve. However, any labels with a priority of 4 or less indeed receive no candidates at all (I'm on the latest 3.14.1 now; with the Version 1 placement engine on the same data set, every polygon cluster received two labels as expected):

Screen Shot 2020-07-29 at 21 01 31

If I may leave some general comments/suggestions about label placement (I know this is a big and complex project and I'd be happy to contribute directly if it's possible!):

  1. The Version 2 placement engine generates no candidates for labels with priority 4 and below (no matter how much I crank up the number of candidates per sqcm). The same is true for point and line labels, so this seems to be a general bug.

  2. In terms of improving documentation, it would be great to add a note here about how the number of candidates can be expected to affect the computational performance of labeling. Does increasing the number of candidates have a linear/polynomial/exponential effect? And how does it interact with the number of features that are being labelled?

  3. I know this goes deep into the label placement engine, but the behaviour of specifying candidates per square cm is not very intuitive, and leads to odd behaviour when label sizes are not specified in canvas units but say map units (such as is the case above). It also seems strange that the number of candidates generated depends only on the size of the feature. The number of meaningfully different (both visually and in terms of avoiding clashes) candidates is probably more like a function of the actual size of the label relative to the total area in which the current placement setting would allow a label to be placed. Again using the example project above, by decreasing the scale, ever more (minusculely different) candidates are being generated, even though the label size really only allows maybe 9 meaningfully different placements for the small polygon labels. In the following screenshot there are already 180 candidates for each of the smaller polygons, most of which will realistically not contribute to finding a good placement or avoiding clashes:

Screen Shot 2020-07-30 at 15 09 48

  4. I don't know how exactly the candidates for polygon labels are generated, but when (as in my example above) there is an even number of candidates for horizontal or around centroid mode then this means that (in most cases) no candidate will be placed at the most logical (and probably also most desirable) position at the polygon center. I wonder if this could be quick-fixed by making sure that polygons (in the appropriate placement modes) will always be given an odd number of candidates, or if the centroid candidate could otherwise be hardcoded?
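
To make points 3 and 4 concrete, here is a small Python sketch of the two effects. It is only a toy model: it assumes the candidate count is simply the per-sqcm density times the rendered area of the feature, and that candidates are spread evenly along one axis. Both functions are hypothetical and do not reproduce the actual candidate generation in QGIS.

```python
import math


def candidate_count(width_cm: float, height_cm: float,
                    density_per_sqcm: float = 2.5) -> int:
    """ASSUMED model: candidates = density * rendered feature area (sq cm)."""
    return math.ceil(width_cm * height_cm * density_per_sqcm)


def evenly_spaced(n: int, extent: float = 1.0) -> list:
    """Hypothetical layout: n candidate offsets spread evenly inside [0, extent]."""
    step = extent / (n + 1)
    return [step * (i + 1) for i in range(n)]


def includes_center(n: int, extent: float = 1.0) -> bool:
    """True if one of the n evenly spaced candidates sits at the midpoint."""
    return any(math.isclose(p, extent / 2) for p in evenly_spaced(n, extent))


if __name__ == "__main__":
    # Zooming in doubles the rendered width and height, so the area (and the
    # assumed candidate count) grows fourfold while the label size in map
    # units grows with it.
    print(candidate_count(2, 2), candidate_count(4, 4))
    # Only odd candidate counts put a candidate exactly at the center.
    print([(n, includes_center(n)) for n in range(1, 7)])
```

In this model the count depends only on the rendered feature size, never on the label size, which is the mismatch described in point 3. The second function shows why an even number of evenly spaced candidates never lands on the center, the situation point 4 suggests avoiding.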

Pedro-Murteira commented 2 years ago

@Mickeler Hello, is this issue still valid?

github-actions[bot] commented 2 years ago

The QGIS project highly values your report and would love to see it addressed. However, this issue has been left in feedback mode for the last 14 days and is being automatically marked as "stale". If you would like to continue with this issue, please provide any missing information or answer any open questions. If you could resolve the issue yourself meanwhile, please leave a note for future readers with the same problem and close the issue. In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this issue. If there is no further activity on this issue, it will be closed in a week.

kevinstadler commented 2 years ago

@Mickeler Hello, is this issue still valid?

The label priority issue I described in https://github.com/qgis/QGIS/issues/32339#issuecomment-666184639 is still valid in the newest QGIS. You can test it yourself by loading the sample project I provided: https://github.com/qgis/QGIS/files/4964892/label-priority-clash-test-project.zip

Mickeler commented 2 years ago

No time to test now, but I will in the future! Thanks!