DHI / terracotta

A light-weight, versatile XYZ tile server, built with Flask and Rasterio :earth_africa:
https://terracotta-python.readthedocs.org
MIT License
697 stars 76 forks source link

Make drivers thread-safe #246

Open dionhaefner opened 2 years ago

dionhaefner commented 2 years ago

Ultimately, we would like to return to multi-threaded access to rasters and metadata. For this we would need to use SQLalchemy's threading mechanism.

mrpgraae commented 2 years ago

Since all of the potential gains from multi-threaded access lie in I/O operations, I'm wondering if it would not make more sense to move towards asyncio and async / await, instead of multi-threading?

I have some experience with SQLAlchemy's new async interface together with asyncpg, so I could be of help here.

Even async with MySQL is slowly maturing with https://github.com/aio-libs/aiomysql though it's not officially supported by SQLAlchemy yet.

dionhaefner commented 2 years ago

The main motivation for this is to have multi-threaded raster access; database queries are never the bottleneck in Terracotta, so no need to optimize here. So unless rasterio / GDAL goes through some serious paradigm shift (i.e., starts supporting non-blocking IO) I don't see much value in this over threading.

mrpgraae commented 2 years ago

Right, GDAL would be the blocker here. Seems very unlikely that they would decide to make an async interface.

mrpgraae commented 2 years ago

I could still see some combination of asyncio and threading that would be beneficial.

For example, using an ASGI framework for handling requests, then doing raster access in an external thread which is then awaited. This would mean that Terracotta could serve multiple requests asynchronously.

This would probably not do anything for single-request latency, but it should, in theory, mean that Terracotta would be able to handle many more requests per second. Might also improve single-request latency when under heavy load.

It's a big refactor and I'm not 100% sure if it can be done in a backwards compatible way, though at first glance it should be possible.

dionhaefner commented 2 years ago

I'm not against async in principle, but in this case the work should start with async raster retrieval, not database accesses. Also we'd need to dump flask? Seems like a huge undertaking to me, and given that GDAL still seems to have issues with multithreading this is hypothetical anyway.

mrpgraae commented 2 years ago

I'm not against async in principle, but in this case the work should start with async raster retrieval, not database accesses.

Yes that's also what I'm suggesting, but to wrap the multithreading inside an async interface, so that it would work within an ASGI web framework. I'm just saying that for async raster retrieval to really pay off, requests should be handled async as well.

Also we'd need to dump flask? Seems like a huge undertaking to me, and given that GDAL still seems to have issues with multithreading this is hypothetical anyway.

Yes, it would be a lot of work, I'm not saying this is something we should definitely do. I'm just thinking out loud and airing ideas :slightly_smiling_face:

And yes, it would hinge on GDAL fixing the issues with multithreading. As I understand the issue, GDAL is supposed to handle multi-threaded reading, but there is a race-condition bug somewhere which means that this doesn't work in practice?

dionhaefner commented 2 years ago

As I understand the issue, GDAL is supposed to handle multi-threaded reading, but there is a race-condition bug somewhere which means that this doesn't work in practice?

Yes, or we don't know. Since the bug is so elusive you only notice it in heavy use (i.e. a production setting), so it may actually be fixed already...