mapbox / carto

fast CSS-like map stylesheets
https://cartocss.readthedocs.io/
Apache License 2.0
652 stars 129 forks source link

Certain selector combinations negatively affect compile time #470

Open amandasaurus opened 7 years ago

amandasaurus commented 7 years ago

I have ported the openstreetmap-carto style to vector tiles, and am using tessera, which is based on tilelive, to serve the raster tiles (which are created from vectortiles). However it takes a very long time to start up (~4 minutes) and that's from carto's processing of the cartocss files. I initially reported this on the tessera issue tracker, see more background there

Using simple print statement debugging, I can see that some of the longest parts are inside the rules = inheritDefinitions(matching, env); call. For some osm-carto layers, there are thousands of items in the matching variable. (e.g. test-polyg-low-zoom has 6,005. bridges has 1,099, test-poly as 6,064 etc). It takes about 1 second (on my machine) for every 1,000 items. so some of these layers take about 5sec to process. There are lots of layers, so it's about 3½ minutes to parse the whole file.

A weird thing is that I can use the same node_modules to parse the upstream project.mml into mapnik XML in 24sec. This is only a problem when parsing a vector tiles file.

I have patched tilelive-tm{style,source} to use the latest carto (0.17.2).

You can test this yourself by checking out the openstreetmap-carto-vector-tiles project and doing make tessera (which will serve the tiles after a while).

nebulon42 commented 7 years ago

I'm undecided if this is a bug or not, but this would need improvement. Thanks for the investigation so far. openstreetmap-carto has slow layers also in the Mapnik XML case but not that slow. I know not enough about the vector workflow to say what could be different there. Needs more investigation.

amandasaurus commented 7 years ago

Some more "print statement debugging" tells me that for the upstream project, the layers have much less items in the matching vartiable. text-poly-low-zoom only has 1,016 layers, bridges has 951, text-poly has 1.054. This is probably why the non-vector-tile version is much faster (30sec vs 3½min).

I know openstreetmap-carto is complicated and slow, but it's the difference which is the bug for me. If it took about the same time, that would be one thing.

To me, this is a bug, since it negatively affects what should be a valid work flow.

nebulon42 commented 7 years ago

Ok, let's call it a bug then.

nebulon42 commented 7 years ago

Doesn't run for me. make tessera failed. I have singled out the command below.

$ ./node_modules/.bin/tessera -c tessera-serve-vector-tiles.json
/dir/openstreetmap-carto-vector-tiles/node_modules/tilelive-cache/index.js:232
    uri.query = uri.query || {};
                   ^

TypeError: Cannot read property 'query' of undefined
    at Object.<anonymous> (/dir/openstreetmap-carto-vector-tiles/node_modules/tilelive-cache/index.js:232:20)
    at Object.load (/dir/openstreetmap-carto-vector-tiles/node_modules/locking-cache/index.js:85:17)
    at /dir/openstreetmap-carto-vector-tiles/node_modules/tessera/server.js:100:16
    at Array.forEach (native)
    at module.exports (/dir/openstreetmap-carto-vector-tiles/node_modules/tessera/server.js:88:25)
    at Object.<anonymous> (/dir/openstreetmap-carto-vector-tiles/node_modules/tessera/bin/tessera.js:82:30)
    at Module._compile (module.js:570:32)
    at Object.Module._extensions..js (module.js:579:10)
    at Module.load (module.js:487:32)
    at tryModuleLoad (module.js:446:12)

Have you seen that before?

amandasaurus commented 7 years ago

I've seen that. You can work around it by running the tmsource on port 8081 in one tessera process/terminatl window, and the tmstyle in another window. I'll post full details on Monday

amandasaurus commented 7 years ago

I have fixed the my vector tiles project, so make tessera will work now. Please pull from the current master.

nebulon42 commented 7 years ago

Some random notes not necessarily related to the behaviour, primarily for myself:

nebulon42 commented 7 years ago

I have turned on benchmarking mode for carto, which you can achieve if you call new carto.Renderer({ benchmark: true }, {}).render(opts) in tilelive-tmsource. It reveals that the following layers are quite slow:

processing layer: text-poly-low-zoom Inheritance time: 43857ms ... processing layer: text-poly Inheritance time: 54167ms ... processing layer: text-point Inheritance time: 53358ms

These three layers alone take nearly 3 minutes on my machine.

For comparison compiling osm-carto with the benchmark option on the command line gives:

processing layer: text-poly-low-zoom Inheritance time: 1411ms ... processing layer: text-poly Inheritance time: 2062ms ... processing layer: text-point Inheritance time: 2044ms

nebulon42 commented 7 years ago

The reason seems to be that in the vector case the layer classes that osm-carto uses are not correctly set and thus not passed on to the renderer. This may lead to inheritance with way too many matching rules. Likely not a carto bug, but something that has to be done differently in osm-carto.tm2/project.yml.

The downstream version of osm-carto does not use classes, so this statement was invalid.

EDIT: changed project.yml e.g. - {id: text-poly-low-zoom, class: text-low-zoom}, but no effect, strange

amandasaurus commented 7 years ago

Is there a way to easily change datasource parameters (db, user, password) and not having to change them in osm-carto.tm2source/data.yml?

If you change the project.mml in the root of the directory and re-run the make.

tilelive-tmsource/tilelive-tmstyle uses an outdated version of carto (0.14.x)

Yes. I have changed tilelive-tmsource & -tmstyle to use "latest", and the same problem occurs.

amandasaurus commented 7 years ago

Further investigation implies that this isn't a bug in carto, per se, but that some of the changes to support vector tiles made the CartoCSS much more complicated, with more permutations, and hence it takes carto much longer to process and parse the file. You can see this if you open up the style files for upstream osm-carto and the vector tiles in a diff viewer (e.g. meld), and copy some of the changes to the original osm-carto style. You'll see the carto command takes much longer, and see that there are more matching values to process¹

Specifically look at the commits/patches to support way_pixels, or to remove all carto classes and use ids instead.

Which implies there's almost no way to fix this problem, short of making carto much faster....


¹ I have put this line in after L43 of lib/carto/renderer.js and it will print out the number of matching values.

console.warn("Matching l.name="+String(l.name)+" has "+matching.length+" values");
nebulon42 commented 7 years ago

Ah, you removed the classes. I have not seen that. That means that I was on the wrong track. Until now I thought there have not been a lot of changes in the style. But now that means I cannot compare the two variants of the style anymore.

nebulon42 commented 7 years ago

I think https://github.com/geofabrik/openstreetmap-carto-vector-tiles/commit/fb863ad might be (one of) the reason/s for the rule explosion in the text layers.

nebulon42 commented 7 years ago

There are some shortcomings in carto that require one to be careful what kind of CartoCSS is written. See e.g. #206 or #20. This needs improvement, but is hard.

amandasaurus commented 7 years ago

The following simple patch on the current HEAD of osm-carto (8be18f047) makes the text-poly-low-zoom have 4,760 (from ~1,000), and makes the whole carto process take 2½ minutes.

diff --git a/amenity-points.mss b/amenity-points.mss
index 0cba35b..888835c 100644
--- a/amenity-points.mss
+++ b/amenity-points.mss
@@ -1597,18 +1597,22 @@
   [feature = 'leisure_track'],
   [feature = 'leisure_dog_park'],
   [feature = 'leisure_pitch'] {
-    [zoom >= 10][way_pixels > 3000][is_building = 'no'],
+    [zoom >= 10][way_pixels > 3000][is_building = 'no'], [is_building = 'no'][zoom=14][way_pixels > 3000], [is_building = 'no'][zoom=15][way_pixels > 750], [is_building = 'no'][zoom=16][way_pixels > 187],
     [zoom >= 17][is_building = 'no'],
-    [zoom >= 10][way_pixels > 3000][feature = 'shop_mall'],
+    [zoom >= 10][way_pixels > 3000][feature = 'shop_mall'], [feature = 'shop_mall'][zoom=14][way_pixels > 3000], [feature = 'shop_mall'][zoom=15][way_pixels > 750], [feature = 'shop_mall'][zoom=16][way_pixels > 187],
     [zoom >= 17][feature = 'shop_mall'] {
       text-name: "[name]";
       text-size: @landcover-font-size;
       text-wrap-width: @landcover-wrap-width-size;
-      [way_pixels > 12000] {
+      [way_pixels > 12000],
+      [zoom=14][way_pixels > 12000], [zoom=15][way_pixels > 3000], [zoom=16][way_pixels > 750], [zoom=17][way_pixels > 187], [zoom=18][way_pixels > 46], [zoom>=19][way_pixels > 11]
+      {
         text-size: @landcover-font-size-big;
         text-wrap-width: @landcover-wrap-width-size-big;
       }
-      [way_pixels > 48000] {
+      [way_pixels > 48000],
+      [zoom=14][way_pixels > 48000], [zoom=15][way_pixels > 12000], [zoom=16][way_pixels > 3000], [zoom=17][way_pixels > 750], [zoom=18][way_pixels > 187], [zoom>=19][way_pixels > 46]
+      {
         text-size: @landcover-font-size-bigger;
         text-wrap-width: @landcover-wrap-width-size-bigger;
       }
nebulon42 commented 7 years ago

Yes, I have also tried reverting https://github.com/geofabrik/openstreetmap-carto-vector-tiles/commit/fb863ad and the performance is back to "normal" for the text layers. Of course reverting is no option, but until #206, #20 or other problems are fixed, there have to be measures taken in the style to keep the performance in reasonable bounds. Help with improvements is always welcome. I myself am also quite late to the carto party and do not have insight into everything.

nebulon42 commented 7 years ago

@rory Side note: You could also try magnacarto a Go implementation of CartoCSS, which I have made available to Node.js. It works with Kosmtik by installing the kosmtik-magnacarto plugin and specifiying --renderer magnacarto.

While it works well with the raster approach I just tried the vector approach but could not get it to work because the parser is called twice, once with Stylesheet property and once without. While carto works fine without this property magnacarto does not like that. You could also try plugging it into tessera if you like that better. magnacarto parses generally faster and I have not seen any longer parsing times with your modifications to osm-carto.

amandasaurus commented 7 years ago

That's an interesting suggestion. However my nodejs knowledge is current not good enough for that. :)

amandasaurus commented 7 years ago

I can probably improve my CartoCSS selectors. In this example I have one nested zoom selectors, which is probably causing a combinatorial expansion in terms, and hence causing the slow down.