Mapnik (async) query parsing takes more time than actual execution

ardean80 commented 7 years ago

I'm using python-mapnik. I noticed that, for each query/layer in my Mapnik project.xml file, PostGIS logs several parse/bind/execute query durations and that, even for 1 ms lasting queries, the parse time is always around 60 ms, much more than the query actual execution time. Is there any solution for this? Some optimization or parameter I can setup on Mapnik/PostgreSQL? A little example:

LOG: duration: 2.403 ms statement: SELECT * FROM (select osm_id, way from planet_osm_polygon where highway = 'pedestrian') as layer_pedestrian_square LIMIT 0

LOG: duration: 49.792 ms parse <unnamed>: SELECT ST_AsBinary(ST_Simplify("way", 0.119433)) AS geom,"osm_id" FROM (select osm_id, way from planet_osm_polygon where highway = 'pedestrian') as layer_pedestrian_square WHERE "way" && ST_SetSRID('BOX3D(1387382.005813842 5144436.025206606,1389120.948207087 5146174.967599851)'::box3d, 3857)

LOG: duration: 1.604 ms bind <unnamed>: SELECT ST_AsBinary(ST_Simplify("way", 0.119433)) AS geom,"osm_id" FROM (select osm_id, way from planet_osm_polygon where highway = 'pedestrian') as layer_pedestrian_square WHERE "way" && ST_SetSRID('BOX3D(1387382.005813842 5144436.025206606,1389120.948207087 5146174.967599851)'::box3d, 3857)

LOG: duration: 2.979 ms execute <unnamed>: SELECT ST_AsBinary(ST_Simplify("way", 0.119433)) AS geom,"osm_id" FROM (select osm_id, way from planet_osm_polygon where highway = 'pedestrian') as layer_pedestrian_square WHERE "way" && ST_SetSRID('BOX3D(1387382.005813842 5144436.025206606,1389120.948207087 5146174.967599851)'::box3d, 3857)

Postgres logs 4 lines (one for each "phase", statement/parse/bind/execute) for each query. I suppose this is the normal behavior for the queries mapnik finds in its project.xml file and that it treats as prepared statements, as it adds the "where way && !bbox!" clause, where the bbox placeholder may vary depending on the zone to be rendered and displayed. The original query from the project.xml file is:

(select osm_id, way from planet_osm_polygon where highway = 'pedestrian') as layer_pedestrian_square

What is strange is that for every query the parse time is "huge", 50-70 ms, especially when the execute time is 1-3 ms. And that parse time (50-70 ms) is the same both for such short queries and for much longer ones. If I run the logged query with the psql command I have this log:

LOG: duration: 4.675 ms statement: SELECT ST_AsBinary(ST_Simplify("way", 0.119433)) AS geom,"osm_id" FROM (select osm_id, way from planet_osm_polygon where highway = 'pedestrian') as layer_pedestrian_square WHERE "way" && ST_SetSRID('BOX3D(1387382.005813842 5144436.025206606,1389120.948207087 5146174.967599851)'::box3d, 3857);

So, what am I doing the wrong way? Is it possible that for every request Mapnik spends so much time in parsing (the same) queries?

talaj commented 7 years ago

You are right, Mapnik doesn't use PQexecPrepared. Would be great to know what is the reason for that.

springmeyer commented 7 years ago

Interesting investigation - thanks for sharing your results. The parse time is a postgres thing, not mapnik, so I'm not sure if we can do anything about that. Perhaps you can research more about how postgres does parsing and how it measures parse time?

The only other thing that comes to mind to mention is that yes, the where way && !bbox! style queries are the ones that Mapnik runs everything time you render a map (triggered by the featureset getting created). However Mapnik also needs to run a variety of queries to get metadata about the postgres table. Like the LIMIT 0 one, which is used to collect attributes and attribute types. These queries are run with the datasource is created, which is during XML loading. For best performance you need to load the XML only once into a map and then render from that same map many times. The gocha however is that maps are not designed to be rendered at the same time by multiple threads. So if you are multithreading you need to keep a pool of maps and only call map.render() on a single map at a time.

ardean80 commented 7 years ago

Thank you so much for your help. :) If I've got what you mean, the "LIMIT 0" queries are the ones launched when the project.xml file is loaded, that is what happens with python-mapnik's mapnik.load_map(...) method. In my execution, I have several worker processes listening for requests, and each one of them has a map property, that is created only once (when the worker is launched) and the load_map method is invoked only in that moment. Indeed in this moment I can (correctly) see many "LIMIT 0" queries in my Postgres logs. Later, when a request is received by a worker process, the mapnik.render(...) method is invoked, in order to render the map, depending on some variable parameters (bbox, size, etc.), and using the previously instantiated map object of the serving process. In this phase I see all the parse/bind/execute logs, where in particular the parse time is very long. I suppose it is a Postgres issue too. Maybe it is due to the fact that Postgres doesn't share parsed queries through multiple connections; so in case of multiple async queries, when more connections are opened/closed, this "heavy" parsing process is repeated multiple times (even for identical queries) and this may affect performances, causing global execution times to raise up, especially with respect to a "serial" execution (max_async_connection = 1) where the same connection is used for all the queries, allowing so to share the same parsed query and to speedup the following query parsing. In this way, only the first query prse takes a long time, while the following ones take a trascurable amount of time (~0.5 ms). Therefore, the result I've obtained is that, in my case, the "serial" (max_async_connection = 1) query mode is faster than the "async" (max_async_connection > 1) one. Unfortunately, the "serial" mode has a scalability issue: the connection estabilished by the worker process at its launch is the same used to execute all the queries (this makes it fast, as seen before). The problem is that, when the database to which it is connected for some reason goes out of the cluster, mapnik doesn't become aware of this and it continues to be connected to a "dead" database. This does not happen with async queries, because this mode allows to open a connection to send some async queries and then it closes such connection. I also tried several configurations, changing the value of the persist_connection parameter too, but I have not got any improvements.

mapnik / mapnik-support

Mapnik (async) query parsing takes more time than actual execution #85