marceloprates / prettymaps

A small set of Python functions to draw pretty maps from OpenStreetMap data. Based on osmnx, matplotlib and shapely libraries.
GNU Affero General Public License v3.0
11.04k stars, 516 forks

Very slow? #22

Open chtenb opened 2 years ago

chtenb commented 2 years ago

It takes about ten minutes to render the first example, which is far too long when you are iteratively improving your map. The strange thing is, during those ten minutes it consumes almost no resources.


Any idea what it is doing all that time?

TaylorBurnham commented 2 years ago

I went through and tossed some time.perf_counter() calls into a function and placed it throughout the draw.py and fetch.py to see what it was doing. A lot of the time is spent querying using osmnx and this is to be expected because it's a public endpoint which doesn't require authorization.

Turning on logging for osmnx via ox.config(log_file=True) shows a little more of what's happening under the hood: it's throttling itself. You will see entries like these multiple times in the log if you flip that on.

2021-08-27 12:26:45,333 DEBUG OSMnx Requesting data within polygon from API in 1 request(s)
2021-08-27 12:26:45,614 DEBUG OSMnx Pausing 56 seconds before making HTTP POST request
2021-08-27 12:27:41,671 DEBUG OSMnx Post <snip> with timeout=180
2021-08-27 12:27:46,574 DEBUG OSMnx Downloaded 200.4kB from overpass-api.de
2021-08-27 12:27:46,588 DEBUG OSMnx Saved response to cache file "cache/59a345ca2b319ed9db7346fa97fa25786b34774e.json"

Note that the last line saves the response to a cache, so if you run it multiple times the second and subsequent runs should be faster because they use the cached data rather than pulling it down again. Depending on what you tweak it may need to call the endpoints to pull down new layers, but it shouldn't have to pull everything.
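To illustrate why the second run is faster, here is a minimal sketch of a response cache keyed by a hash of the request. This is not OSMnx's actual implementation; `cached_fetch`, `slow_fetch`, and the example URLs are hypothetical names made up for this example, and the choice of SHA-1 is only a guess based on the 40-character file name in the log.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cached_fetch(url, fetch, cache_dir):
    """Return the response for url, calling fetch(url) only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # One cache file per distinct request, named after a hash of the request.
    key = hashlib.sha1(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"
    if path.exists():
        # Cache hit: no network call and no throttling pause.
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch(url)
    path.write_text(json.dumps(data), encoding="utf-8")
    return data


calls = []


def slow_fetch(url):
    # Hypothetical stand-in for a throttled network request.
    calls.append(url)
    return {"url": url, "elements": []}


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("https://example.invalid/query?a=1", slow_fetch, tmp)
    second = cached_fetch("https://example.invalid/query?a=1", slow_fetch, tmp)
    third = cached_fetch("https://example.invalid/query?a=2", slow_fetch, tmp)

print(len(calls))  # only the two distinct requests reached slow_fetch
```

The repeated request is served from disk, while the changed request still goes to the (simulated) network. That matches the behaviour described above: tweaking the map may trigger new downloads, but unchanged layers come from the cache.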

You can read more about the Overpass API here, but throttling like this is in place to keep people from hammering endpoints and crashing them. If you want to run this without having to worry about throttling then you also have the option of setting up your own server, but it's not necessary for most people.
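To see how much of a run goes to this throttling, the pauses recorded in the OSMnx log can be totaled. The sketch below assumes the log keeps the "Pausing N seconds" wording shown in the excerpt; `total_pause_seconds` is a hypothetical helper, not part of OSMnx.

```python
import re

# Log lines copied from the excerpt above (the line containing <snip> is omitted).
LOG = """\
2021-08-27 12:26:45,333 DEBUG OSMnx Requesting data within polygon from API in 1 request(s)
2021-08-27 12:26:45,614 DEBUG OSMnx Pausing 56 seconds before making HTTP POST request
2021-08-27 12:27:46,574 DEBUG OSMnx Downloaded 200.4kB from overpass-api.de
2021-08-27 12:27:46,588 DEBUG OSMnx Saved response to cache file "cache/59a345ca2b319ed9db7346fa97fa25786b34774e.json"
"""

PAUSE_RE = re.compile(r"Pausing (\d+(?:\.\d+)?) seconds")


def total_pause_seconds(log_text):
    """Sum every 'Pausing N seconds' entry found in the log text."""
    return sum(float(m.group(1)) for m in PAUSE_RE.finditer(log_text))


print(total_pause_seconds(LOG))
```

In the excerpt, the request spans roughly 61 seconds from the first to the last timestamp, and 56 of those are the pause, so the download itself is only a small part of the wait.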

My quick and dirty performance counter just printed the time elapsed since the previous call, and the result is further down.

import time
from datetime import datetime

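def _print_perf(p, msg):
    """Print the seconds elapsed since p and return the new reference time."""
    n = time.perf_counter()
    d = n - p  # time since the previous checkpoint
    print(f"{datetime.now()}\t{__name__}\t{d:.2f}\t{msg}")
    return n  # pass this back in as p on the next call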

n = time.perf_counter()
n = _print_perf(n, "Starting Function")
...
n = _print_perf(n, "Time Difference")

Resulted in this:

12:10:50.57      __main__                0.00    Initialized
12:10:50.59      __main__                0.02    Creating Plot
12:10:50.59      prettymaps.draw         0.00
12:10:50.59      prettymaps.draw         0.00    Building base_kwargs
12:10:51.95      prettymaps.draw         1.36    Fetching Layers from fetch.py
12:10:51.95      prettymaps.fetch        0.00    Getting Layers
12:10:51.95      prettymaps.fetch        0.00    Running ox.graph_to_gdfs
12:13:40.13      prettymaps.fetch        168.18  Done running ox.graph_to_gdfs
12:13:40.13      prettymaps.fetch        0.00    Getting Layers
12:13:40.13      prettymaps.fetch        0.00    Getting Street
12:14:13.72      prettymaps.fetch        33.59   Done
12:14:13.72      prettymaps.fetch        0.00    Getting Layers
12:14:13.72      prettymaps.fetch        0.00    Getting Geometries
12:16:15.26      prettymaps.fetch        121.54  Done
12:16:15.26      prettymaps.fetch        0.00    Getting Layers
12:16:15.26      prettymaps.fetch        0.00    Getting Geometries
12:16:53.27      prettymaps.fetch        38.01   Done
12:16:53.27      prettymaps.fetch        0.00    Getting Layers
12:16:53.27      prettymaps.fetch        0.00    Getting Geometries
12:18:19.00      prettymaps.fetch        85.73   Done
12:18:19.00      prettymaps.fetch        0.00    Getting Layers
12:18:19.00      prettymaps.fetch        0.00    Getting Geometries
12:18:21.12      prettymaps.fetch        2.12    Done
12:18:21.12      prettymaps.fetch        0.00    Getting Layers
12:18:21.12      prettymaps.fetch        0.00    Getting Geometries
12:19:12.99      prettymaps.fetch        51.87   Done
12:19:12.99      prettymaps.draw         501.05  Done fetching layers
12:19:12.99      prettymaps.draw         0.00    Transforming Layers
12:19:13.00      prettymaps.draw         0.01    Done
12:19:13.00      prettymaps.draw         0.00    Drawing Background
12:19:13.00      prettymaps.draw         0.00    Done
12:19:13.00      prettymaps.draw         0.00    Drawing Layers on Canvas
12:19:13.00      prettymaps.draw         0.00    Drawing Layer...
12:19:13.00      prettymaps.draw         0.00    Drawing Layer...
12:19:13.10      prettymaps.draw         0.10    Drawing Layer...
12:19:15.13      prettymaps.draw         2.03    Drawing Layer...
12:19:15.13      prettymaps.draw         0.01    Drawing Layer...
12:19:15.16      prettymaps.draw         0.03    Drawing Layer...
12:19:15.16      prettymaps.draw         0.00    Drawing Layer...
12:19:15.26      prettymaps.draw         0.10    Done - Returning layers
12:19:15.26      __main__                504.67  Done Generation
12:19:15.26      __main__                0.00    Saving PNG
12:19:17.97      __main__                2.71    Saving SVG
12:19:20.67      __main__                2.70    Done
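As a quick check on the numbers above, the fetch steps can be added up and compared with the total generation time. The values are copied from the prettymaps.fetch "Done" lines and the "Done Generation" line; the variable names are just for illustration.

```python
# Durations (seconds) of the completed steps reported by prettymaps.fetch above.
fetch_steps = [168.18, 33.59, 121.54, 38.01, 85.73, 2.12, 51.87]
total_generation = 504.67  # "Done Generation" reported by __main__

fetch_total = sum(fetch_steps)
share = fetch_total / total_generation

print(f"fetch: {fetch_total:.2f}s of {total_generation:.2f}s ({share:.1%})")
```

The sum agrees with the 501.05 seconds that prettymaps.draw reports for "Done fetching layers" to within rounding, so about 99% of the run is spent fetching data and only a few seconds go to drawing.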

setop commented 2 years ago

> I went through and tossed some time.perf_counter() calls into a function and placed it throughout the draw.py and fetch.py to see what it was doing. A lot of the time is spent querying using osmnx and this is to be expected because it's a public endpoint which doesn't require authorization.

Same here. I did some profiling, and most of the time is spent waiting for OSM data to be fetched over the network.

flamegraph

sudhamsugurijala commented 2 years ago

I'm working on this issue and will respond with a solution (if I can find one).