Open treee111 opened 1 year ago
Two performance tests on Google Cloud, using 22 vCPUs and 44 GB RAM. The CPU is shown as: Intel(R) Xeon(R) Platinum 8481C CPU @ 2.70GHz. Google instance type: c3-highcpu-22.
10 GB of RAM was used as a RAM disk (tmpfs) holding the _tiles directory, since I've found that this greatly increases performance for the "Split filtered country files to tiles" part.
For Germany
INFO:+ Filter tags from country osm.pbf files: OK, took 0.00142, 229.55468
INFO:+ Generate land for each coordinate: OK, took 0.02954, 25.73160
INFO:+ Generate sea for each coordinate: OK, took 0.00348, 0.00348
INFO:+ Split filtered country files to tiles: OK, took 0.05190, 621.29996
INFO:+ Merge splitted tiles with land, elevation, and sea: OK, took 0.05836, 308.47714
INFO:+ Creating .map files for tiles: OK, took 0.05445, 641.21863
INFO:+ Zip .map.lzma files: OK
INFO:+ Create .map.lzma files: OK, took 0.29313, 13.28629
INFO:Total time 7.78153, 1861.73668
For Great Britain
INFO:+ Filter tags from country osm.pbf files: OK, took 0.00131, 86.73431
INFO:+ Generate land for each coordinate: OK, took 0.06786, 59.63969
INFO:+ Generate sea for each coordinate: OK, took 0.00652, 0.00653
INFO:+ Split filtered country files to tiles: OK, took 0.10295, 646.76114
INFO:+ Merge splitted tiles with land, elevation, and sea: OK, took 0.11657, 343.61082
INFO:+ Creating .map files for tiles: OK, took 0.09248, 533.82919
INFO:+ Zip .map.lzma files: OK
INFO:+ Create .map.lzma files: OK, took 0.16292, 6.72935
INFO:Total time 9.09717, 1700.81795
From the CPU graphs, the "Split filtered country files to tiles" part seems to be almost CPU bound, with 75% total CPU usage.
So I think the "Merge splitted tiles with land, elevation, and sea" part should be looked at first for improving performance, since it does not currently seem to be close to CPU bound. However, I've also twice seen that section get stuck on one tile when generating Norway, which has about 1200 tiles to merge. I will probably raise a separate issue for that problem.
It would also be nice to have some performance numbers from others, to see where the main bottlenecks are.
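To make numbers from different machines easier to compare, the log lines can be turned into per-step shares of the run. The following is only a rough sketch: it assumes the second number after "took" is wall-clock seconds (the step values add up to roughly the reported total, which suggests this but does not prove it), and the names `LINE` and `step_shares` are made up for this example.

```python
import re

# Log lines copied from the Germany run above.
LOG = """\
INFO:+ Filter tags from country osm.pbf files: OK, took 0.00142, 229.55468
INFO:+ Generate land for each coordinate: OK, took 0.02954, 25.73160
INFO:+ Generate sea for each coordinate: OK, took 0.00348, 0.00348
INFO:+ Split filtered country files to tiles: OK, took 0.05190, 621.29996
INFO:+ Merge splitted tiles with land, elevation, and sea: OK, took 0.05836, 308.47714
INFO:+ Creating .map files for tiles: OK, took 0.05445, 641.21863
INFO:+ Zip .map.lzma files: OK
INFO:+ Create .map.lzma files: OK, took 0.29313, 13.28629
INFO:Total time 7.78153, 1861.73668
"""

# Matches only step lines that carry two timing values.
LINE = re.compile(
    r"INFO:\+ (?P<step>.+): OK, took (?P<first>[\d.]+), (?P<second>[\d.]+)"
)


def step_shares(log):
    """Return (step, seconds, percent of summed step time) per timed step."""
    # Assumption: the second value is the wall-clock duration in seconds.
    rows = [(m["step"], float(m["second"])) for m in LINE.finditer(log)]
    total = sum(seconds for _, seconds in rows)
    return [(step, seconds, 100 * seconds / total) for step, seconds in rows]


for step, seconds, percent in step_shares(LOG):
    print(f"{percent:5.1f}%  {seconds:10.2f}s  {step}")
```

For the Germany run this puts "Split filtered country files to tiles" and "Creating .map files for tiles" at roughly a third of the summed step time each, with the merge step well behind them.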
Currently prototyping this in #235
Describe the solution you'd like
Parallelize the calls in each processing step where the threads do not occupy the same resource. Implement time measurement first, so the effect can be tracked.
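As a minimal sketch of the two ideas (not the project's actual implementation), a timing helper can wrap each step while a bounded thread pool runs independent per-tile calls. Everything named here (`timed`, `process_tile`, `run_step`, the `workers` parameter) is hypothetical, and `process_tile` only sleeps to stand in for a real call such as an external tool. The helper logs the CPU time of the Python process and the wall-clock time; whether that matches the two numbers in the existing logs is an assumption.

```python
import logging
import time
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager

log = logging.getLogger("timing_sketch")


@contextmanager
def timed(step):
    """Log process CPU time and wall-clock time for one step on success."""
    cpu_start, wall_start = time.process_time(), time.perf_counter()
    yield
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    log.info("+ %s: OK, took %.5f, %.5f", step, cpu, wall)


def process_tile(tile):
    """Hypothetical stand-in for one per-tile call (e.g. an external tool)."""
    time.sleep(0.05)  # placeholder for the real work
    return f"{tile[0]}/{tile[1]}"


def run_step(tiles, workers=4):
    """Run the per-tile calls concurrently, keeping results in input order."""
    # Threads are enough when each call mostly waits on a subprocess or I/O.
    # The worker count caps how many calls compete for the same resource.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_tile, tiles))


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    tiles = [(x, y) for x in range(2) for y in range(4)]
    with timed("Process tiles in parallel"):
        run_step(tiles)
```

Measuring first makes it possible to choose `workers` per step: a step that is already close to CPU bound gains little from more workers, while a step that mostly waits on disk or on a single subprocess may gain more.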