Closed tcihak-fqa closed 1 year ago
Thanks very much for the suggestion. I think that is technically possible and would amount to parallelising this for loop:
However, I'm not sure parallelisation would speed things up. It would if the bottleneck were in processing the tiles, but my hunch is that most of the time is spent on download/latency issues. If that's the case, would parallelisation actually help (if anything, it would add more download load on the same bandwidth)?
Thanks for the response! So I'm using contextily to generate static maps from a web api that has high bandwidth. I agree that most of the time is spent on I/O latency.
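To make the latency point concrete, here is a minimal sketch with simulated requests. It does not touch contextily or any real endpoint: `fake_fetch` and the 50 ms latency are assumptions made up for illustration. When each request mostly waits, threads let those waits overlap, so the total time approaches that of a single request rather than the sum of all of them.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY_S = 0.05  # assumed per-request latency, purely illustrative


def fake_fetch(tile_id):
    """Hypothetical stand-in for a tile request: it only waits, then returns."""
    time.sleep(LATENCY_S)
    return tile_id


def timed(fn):
    """Run fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


tile_ids = list(range(8))

# One request after another: the waits add up.
sequential, t_seq = timed(lambda: [fake_fetch(t) for t in tile_ids])

# Eight threads: the waits overlap.
with ThreadPoolExecutor(max_workers=8) as pool:
    threaded, t_thr = timed(lambda: list(pool.map(fake_fetch, tile_ids)))

print(f"sequential: {t_seq:.2f}s, threaded: {t_thr:.2f}s")
```

This only models latency. If the endpoint or the local connection is saturated on bandwidth, the extra threads would not help in the same way.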
I have actually been considering making a pull request for this exact functionality. I've been playing around with it in another project, and for large images it can make a rather big difference. The optimal number of parallel downloads differs quite a bit from endpoint to endpoint, but 8-16 is normally a good range. The code below can be used as a drop-in replacement for the for loop:
def bounds2img(
    w, s, e, n, zoom="auto", source=None, ll=False, wait=0, max_retries=2, num_parallel_tile_downloads=16
):
    .
    .
    .
    # download and merge tiles
    # tiles = []
    # arrays = []
    # for t in mt.tiles(w, s, e, n, [zoom]):
    #     x, y, z = t.x, t.y, t.z
    #     tile_url = provider.build_url(x=x, y=y, z=z)
    #     image = _fetch_tile(tile_url, wait, max_retries)
    #     tiles.append(t)
    #     arrays.append(image)
    from joblib import Parallel, delayed  # This should go to the top of the file
    tiles = list(mt.tiles(w, s, e, n, [zoom]))
    tile_urls = [provider.build_url(x=tile.x, y=tile.y, z=tile.z) for tile in tiles]
    max_num_parallel_tile_downloads = 32
    # Note that num_parallel_tile_downloads has been added as an argument to the function
    if num_parallel_tile_downloads < 1 or num_parallel_tile_downloads > max_num_parallel_tile_downloads:
        raise ValueError(
            f"num_parallel_tile_downloads must be between 1 and {max_num_parallel_tile_downloads}"
        )
    arrays = Parallel(n_jobs=num_parallel_tile_downloads, prefer="threads")(
        delayed(_fetch_tile)(tile_url, wait, max_retries) for tile_url in tile_urls
    )
    merged, extent = _merge_tiles(tiles, arrays)
    .
    .
    .
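For anyone who wants to try the pattern outside contextily, here is a rough standard-library analogue. It swaps joblib for `concurrent.futures.ThreadPoolExecutor`, and `fetch_all`, `fake_fetch`, and the example URLs are hypothetical names invented for this sketch, not part of contextily. The point it illustrates is that results come back in input order, which is what the merge step needs so that each array lines up with its tile.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 32  # same upper bound as in the snippet above


def fetch_all(urls, fetch, num_parallel=16):
    """Call fetch(url) for every URL using a thread pool, keeping input order."""
    if not 1 <= num_parallel <= MAX_PARALLEL:
        raise ValueError(f"num_parallel must be between 1 and {MAX_PARALLEL}")
    with ThreadPoolExecutor(max_workers=num_parallel) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which request finishes first.
        return list(pool.map(fetch, urls))


def fake_fetch(url):
    """Hypothetical fetcher: returns a label instead of downloading anything."""
    return f"image for {url}"


# Made-up URLs on a reserved, non-resolving domain.
urls = [f"https://example.invalid/7/{x}/42.png" for x in range(4)]
results = fetch_all(urls, fake_fetch, num_parallel=4)

for url, result in zip(urls, results):
    print(url, "->", result)
```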
I just tested it in the intro_guide.ipynb notebook by downloading an extended version of the Ghent image in the Coordinate-based searches section, with the following code:
import time

west, south, east, north = (
    3.616218566894531,
    50.98912458110244,
    5.8483047485351562,
    54.13994019806845,
)

start_time = time.time()
ghent_img, ghent_ext = cx.bounds2img(
    west,
    south,
    east,
    north,
    ll=True,
    zoom=11,
    source=cx.providers.Stamen.Toner,
    num_parallel_tile_downloads=8,
)
print(f"Download time: {time.time() - start_time}")
ghent_img.shape
Note that I had to comment out the @memory.cache decorator for the _fetch_tile function during the test.
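As a small illustration of why a cache has to be disabled for timing runs like this, the sketch below uses `functools.lru_cache` as a stand-in for a disk cache. It is not contextily's actual decorator, and `fake_fetch` and its 50 ms delay are assumptions. A repeated call returns without doing the slow work, so a benchmark with a warm cache would measure almost nothing.

```python
import time
from functools import lru_cache

calls = {"count": 0}  # counts how often the slow body actually runs


@lru_cache(maxsize=None)  # stand-in for a cache decorator, not contextily's own
def fake_fetch(url):
    """Hypothetical slow fetch: sleeps to mimic a download."""
    calls["count"] += 1
    time.sleep(0.05)
    return f"tile for {url}"


def time_call(url):
    """Return how long a single fake_fetch(url) call takes."""
    start = time.perf_counter()
    fake_fetch(url)
    return time.perf_counter() - start


url = "https://example.invalid/7/65/42.png"  # made-up URL
cold = time_call(url)  # runs the slow body
warm = time_call(url)  # served from the cache

print(f"cold: {cold:.3f}s, warm: {warm:.6f}s, slow calls: {calls['count']}")
```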
The shape of the image was (7680, 3584, 4), and the download times were:
num_parallel_tile_downloads=1: 69.49s
num_parallel_tile_downloads=2: 36.11s
num_parallel_tile_downloads=4: 18.32s
num_parallel_tile_downloads=8: 9.55s
num_parallel_tile_downloads=16: 5.02s
num_parallel_tile_downloads=32: 2.92s
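To read these numbers as speedups, here is a short calculation on the timings listed above. Speedup is the single-download time divided by the parallel time, and efficiency is the speedup divided by the number of parallel downloads. Nothing here is measured again; it only rearranges the reported values.

```python
# Reported download times in seconds, keyed by num_parallel_tile_downloads.
timings_s = {1: 69.49, 2: 36.11, 4: 18.32, 8: 9.55, 16: 5.02, 32: 2.92}

baseline = timings_s[1]
speedup = {n: baseline / t for n, t in timings_s.items()}
efficiency = {n: speedup[n] / n for n in timings_s}

for n in timings_s:
    print(f"{n:>2} parallel: speedup {speedup[n]:5.2f}x, efficiency {efficiency[n]:.0%}")
```

On these figures the efficiency stays above 85% up to 16 parallel downloads and falls to roughly 74% at 32.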
I don't think there should be any downside to always doing it in parallel (the overhead should be minimal), as long as the number of parallel downloads doesn't overwhelm the endpoint. A default value of 16 is a good starting point in my experience with the endpoints I've tested. That normally gives almost linear improvements. Above that, it varies quite a bit.
That ended up being quite a bit of text. Hopefully it's useful :wink:
Let me know if you would like a pull request with the implementation :+1:
Just made a couple more tests with smaller images:
At zoom=9 the shape was (2048, 1024, 4) and download times were:
num_parallel_tile_downloads=1: 5.38s
num_parallel_tile_downloads=16: 0.50s
At zoom=7 the shape was (512, 512, 4) and download times were:
num_parallel_tile_downloads=1: 0.69s
num_parallel_tile_downloads=16: 0.17s (there are only 4 tiles, so in practice we only do 4 parallel downloads)
I also tested the three zoom levels with the current for loop implementation. The download times were pretty much exactly the same as with num_parallel_tile_downloads=1.
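The tile counts behind these tests can be estimated from the image shapes. The sketch below assumes square 256-pixel tiles, which is consistent with the 4 tiles mentioned for the (512, 512, 4) image but is otherwise an assumption. `tile_count` and `effective_workers` are hypothetical helpers written for this example.

```python
TILE_SIZE = 256  # assumption: square 256-pixel tiles


def tile_count(shape, tile_size=TILE_SIZE):
    """Estimate the number of tiles in a merged image of the given shape."""
    height, width = shape[:2]
    return (height // tile_size) * (width // tile_size)


def effective_workers(requested, n_tiles):
    """There is never more parallel work than there are tiles."""
    return min(requested, n_tiles)


# Image shapes reported in the tests above, keyed by zoom level.
shapes = {11: (7680, 3584, 4), 9: (2048, 1024, 4), 7: (512, 512, 4)}

for zoom, shape in shapes.items():
    n_tiles = tile_count(shape)
    workers = effective_workers(16, n_tiles)
    print(f"zoom={zoom}: {n_tiles} tiles, {workers} effective parallel downloads")
```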
This solution looks very nice Jacob! My vote would be to create a PR and get it into a release at some point.
Thanks @tcihak-fqa :smiley:
I've just added a pull request with the changes (https://github.com/geopandas/contextily/pull/217).
If you'd like to use it now, you can install it with:
pip uninstall -y contextily
pip install git+https://github.com/JacobJeppesen/contextily@parallel_tile_downloads
Thanks Jacob. I monkey-patched your original solution and it seems to be working well. I haven't encountered any memory issues, but the number of API requests has been light so far.
Closed by #217
Fetching the web tiles can take a long time when there are many to download. Would it be possible to do multiple tile requests in parallel using the multiprocessing module?