t-rex-tileserver / t-rex

t-rex is a vector tile server specialized on publishing MVT tiles from your own data
https://t-rex.tileserver.ch/
MIT License
545 stars · 68 forks

Parallel calls performance with geopackage as datasource #304

Closed DKKepinski closed 1 year ago

DKKepinski commented 1 year ago

Hello,

I'm from the same team as Roel Arents, who created this issue: https://github.com/t-rex-tileserver/t-rex/issues/286

We've done a deeper analysis of the performance problem we were facing. In short: when we make multiple requests in parallel, performance is worse than when the same requests are made one after another. Our theory was that this is caused by GeoPackage locking. To test this theory, we profiled the parallel and the sequential case (old being parallel, new being sequential; green being the difference between the two runs): (screenshot: profiling comparison). We noticed that a substantial amount of time was spent in the GDAL library, and that it contained quite a lot of lock/unlock calls, which took a substantial amount of time: (screenshot: GDAL lock/unlock calls).

This seems to confirm our suspicions, so we tried to see if we could somehow solve it. My knowledge of Rust is very limited, but my attempt at a quick win gave me a 40% boost in response time, and I've been unable to notice any drawbacks: (screenshot: response-time comparison)

We've noticed that the GeoPackage is opened with default flags, which seem to include locking:

```rust
let dataset = Dataset::open(Path::new(&self.path)).unwrap();
```

I have changed it like this:

```rust
let mut options: DatasetOptions = DatasetOptions::default();
options.open_flags = GdalOpenFlags::GDAL_OF_UPDATE;
let dataset = Dataset::open_ex(Path::new(&self.path), options).unwrap();
```

This resulted in the aforementioned performance increase. I based it on this documentation I found on the internet: https://gdal.org/doxygen/gdal_8h.html#a9cb8585d0b3c16726b08e25bcc94274a. I expected that I would have to use the shared flag, but that somehow didn't seem possible (either I misunderstand the documentation, or my limited Rust knowledge stopped me from doing that), but the update flag seems to do the job.

This seems to further confirm our suspicions that geopackage locking is the culprit.

Something else I've noticed after a quick glance at the code (again, with my limited knowledge) is that the code seems to assume that the GeoPackage is mutable. If you assume that the GeoPackage can change during runtime, then it makes sense to lock it while reading. There are multiple places where the GeoPackage is opened in the code, rather than opening it once and keeping the connection open, which also makes sense under this assumption. For example, it looks to me like a connection is opened and closed for each layer, rather than once for all layers.

In our case the GeoPackage is immutable, so things like locking are unnecessary. I can, however, imagine that other people's use cases include GeoPackages that change during runtime.

My question is as follows:

Is it assumed that the GeoPackage is mutable, and is this intended?

If not, then we believe a performance boost is possible by using different open flags and maybe some extra code optimizations. If it is intended, then an option in t-rex to assume an immutable GeoPackage (with the aforementioned performance optimizations) would be of great help to us.

DKKepinski commented 1 year ago

We've run the same code change with a larger datasource at a different zoom level, and the difference was negligible. I'm closing this issue until further testing that explains the difference is done.