treee111 / wahooMapsCreator

Create maps for Wahoo device based on latest OSM maps

A collection of performance improvements #231

Open alfh opened 9 months ago

alfh commented 9 months ago

Is your feature request related to a problem? Please describe.

I would like to improve the performance of the map generation, making sure that my CPU is working as hard as possible, to generate the maps of large and small countries.

Manual tests show me that it would be possible on my PC to go from 3 days to 30 minutes for generating Canada, which has almost 5000 tiles. (That does not include contour lines; I have not tested that yet.)

Describe the solution you'd like

These are just my high-level thoughts as of now. I think separate feature requests should be raised for each bullet point, but I want some initial feedback on the points and on the loosely planned "road" to better performance. I am really open to input here, both on what would be wise and on the sequence in which we should tackle the issues / sub-issues.

  1. Add a Benchmark.md file or similar, where one can list the time taken for various countries, using various PCs and a specific version of wahooMapsCreator. This would give us visibility of any performance improvements, and give users an idea of what performance to expect.
  2. Use osmium ID renumbering on the two files generated in filter_tags_from_country_osm_pbf_files. This would take a few seconds, but it would reduce the memory use in the upcoming "extract" steps, since that depends quite a bit on the highest IDs in the files.
  3. Start using asyncio to launch external programs. This has the most potential, and is probably quite easy. This would allow us to have "CPU count minus 1" external programs running in parallel for at least some of the steps. Some of the steps are not really multithreaded, so each invocation only uses one CPU core.
  4. Split the actual invocation of the external program for each step into a separate method. This would allow each step to use a "for" loop that sets up asyncio tasks, and then await them all at the end.
  5. For the extract step, generate a JSON file with X number of tiles, and then invoke osmium extract with it. In the beginning this could be just 5-10 tiles in each batch (the memory usage is proportional to the number of tiles). This will be a great time saver I think for large countries.
  6. For the extract step, going further, the number of tiles in the batch could be increased, so that (total number of tiles) / (number of CPU cores - 1) is used as the batch size. The memory requirement increases with this, but I am also looking into changes to osmium itself, which would decrease that memory usage by about 10x (currently with some performance degradation).
  7. Add more unit tests for countries (not requiring the large pbf files to be checked in), so we can see that our changes do not affect the resulting map files.
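
To make points 5 and 6 concrete, here is a minimal sketch of the batching. It assumes each tile is a dict with `x`, `y`, `left`, `bottom`, `right` and `top` keys; those key names, the function names and the output file naming are hypothetical and not taken from the current code. The JSON layout follows my understanding of the osmium extract config format (`directory` plus an `extracts` list, each entry with `output` and a `bbox` given as left, bottom, right, top), so it should be checked against the osmium manual before use.

```python
import json
import os


def batch_size_for(total_tiles, cpu_count=None):
    """Point 6: (total number of tiles) / (number of CPU cores - 1), rounded up."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 2)
    workers = max(1, cpus - 1)
    return max(1, -(-total_tiles // workers))  # ceiling division


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def extract_config(tiles, output_dir):
    """Build one osmium extract config (as a dict) covering a batch of tiles."""
    return {
        "directory": output_dir,
        "extracts": [
            {
                # Hypothetical naming scheme: one output file per tile.
                "output": f"{tile['x']}_{tile['y']}.osm.pbf",
                # Bounding box order assumed: left, bottom, right, top.
                "bbox": [tile["left"], tile["bottom"], tile["right"], tile["top"]],
            }
            for tile in tiles
        ],
    }


def write_batch_configs(tiles, size, output_dir, config_dir):
    """Write one JSON config file per batch and return the file paths."""
    os.makedirs(config_dir, exist_ok=True)
    paths = []
    for index, batch in enumerate(chunk(tiles, size)):
        path = os.path.join(config_dir, f"extract_batch_{index}.json")
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(extract_config(batch, output_dir), handle, indent=2)
        paths.append(path)
    return paths
```

Each config file would then be handed to one `osmium extract --config <file> <input.osm.pbf>` call. Starting with a fixed size of 5-10 keeps memory use low; `batch_size_for` is the more aggressive variant from point 6.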

Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

Additional context

Consider whether any minimum RAM requirement should be set, like 8 GB perhaps?

https://realpython.com/async-io-python/ is a good article on asyncio, in addition to the Python manual. I also looked a bit at trio, https://trio.readthedocs.io/en/stable/, but I think the standard asyncio is good enough for our use.

alfh commented 9 months ago

Ref 3: I'm prototyping the use of asyncio in a branch of mine for now: https://github.com/alfh/wahooMapsCreator/tree/use_asyncio_for_external_programs
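
For discussion, the general pattern behind points 3 and 4 could look roughly like the sketch below. This is an illustration only, not the code in that branch: `run_command` and `run_all` are hypothetical names, and the demo commands are stand-ins for the real external program calls. A semaphore caps the number of programs running at once at "CPU count minus 1".

```python
import asyncio
import os
import sys


async def run_command(cmd, semaphore):
    """Run one external program, waiting for a free slot first."""
    async with semaphore:
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        stdout, stderr = await proc.communicate()
        return proc.returncode, stdout.decode(), stderr.decode()


async def run_all(commands, max_parallel=None):
    """Start one task per command and await them all at the end."""
    if max_parallel is None:
        max_parallel = max(1, (os.cpu_count() or 2) - 1)
    semaphore = asyncio.Semaphore(max_parallel)
    # The "for" loop from point 4: set up the tasks, then await them together.
    tasks = [asyncio.create_task(run_command(cmd, semaphore)) for cmd in commands]
    # gather returns results in the same order as the commands were given.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    # Stand-in commands; the real ones would be the per-tile external calls.
    demo = [[sys.executable, "-c", f"print({i} * {i})"] for i in range(4)]
    for returncode, out, _err in asyncio.run(run_all(demo)):
        print(returncode, out.strip())
```

Since each external program runs in its own process, the event loop only waits on them, so a single Python thread is enough to keep all the cores busy.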