Low CPU usage for Tippecanoe on EC2

damg22 commented 4 months ago

I am attempting to run Tippecanoe on EC2 with a very large GeoJSON file (~110GB), and the job takes too long to progress (an indefinite amount of time, at least 48 hours). After a lot of searching for root causes, I ran 'top' and noticed Tippecanoe was using only 1 of the 16 available cores. When I ran 'top' locally on macOS, Tippecanoe was using 7 cores, which would explain why it was so much faster locally. After playing around with Tippecanoe and reading your docs, I found the TIPPECANOE_MAX_THREADS setting and set it to 16, one thread per core. This seemed to briefly raise CPU usage to all 16 cores, but once the reading progress gets to 99.9%, CPU usage drops to only 1 core, which causes the job to take days to complete. Do you have any recommendations, or any help you could provide in debugging this issue?

DeepakSharda commented 4 months ago

Yes, it would be useful if we could pass the number of threads as a parameter to speed up the process. Data is growing really big nowadays.


mtravis commented 4 months ago

Try converting the GeoJSON to FlatGeobuf using ogr2ogr and then running Tippecanoe on the result.

FGBs are smaller, stream quicker, and run jobs in parallel by default.
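A minimal sketch of that two-step workflow, written in Python only to keep the commands explicit. The file names are placeholders, and it assumes a GDAL build that includes the FlatGeobuf driver and a Tippecanoe build that accepts `.fgb` input. Nothing is executed unless you uncomment the `subprocess.run` line.

```python
import shlex
import subprocess


def build_commands(geojson_path, fgb_path, mbtiles_path):
    """Return the conversion and tiling commands as argument lists."""
    # ogr2ogr takes the destination first, then the source.
    convert = ["ogr2ogr", "-f", "FlatGeobuf", fgb_path, geojson_path]
    # Tippecanoe writes to the file given with -o and reads the FlatGeobuf file.
    tile = ["tippecanoe", "-o", mbtiles_path, fgb_path]
    return convert, tile


if __name__ == "__main__":
    # Placeholder paths; substitute your own files.
    for cmd in build_commands("input.geojson", "input.fgb", "output.mbtiles"):
        print(shlex.join(cmd))
        # subprocess.run(cmd, check=True)  # uncomment to actually run each step
```

Converting a file of this size is itself a long job, so it may be worth trying the workflow on a small extract first.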

Hope that helps

Matt

DeepakSharda commented 4 months ago

Will that help with tile-join as well? Thanks for the tip.


mtravis commented 4 months ago

No problem. tile-join only works on mbtiles so I don't think you'd see any improvement there.

damg22 commented 4 months ago

@mtravis Thanks for the suggestion. My concern is that, at the end of the day, I'd still have to run Tippecanoe on the EC2 instance with a single core. I have done a good amount of testing on the instance, and it seems like this is related to Tippecanoe's implementation. I am currently looking through the source code for a possible bug.

Wondering if @e-n-f has any insights on this? It seems like setting TIPPECANOE_MAX_THREADS simply isn't forcing more CPU usage. I can confirm the docker container has 16 cores available.
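One way to double-check the core count from inside the container, and to pass the thread limit explicitly, is sketched below. It assumes TIPPECANOE_MAX_THREADS is read from the process environment, as the Tippecanoe README describes; the helper names are made up for illustration. Note that a Docker CPU quota (for example `--cpus`) is enforced through cgroups and may not show up in the affinity mask checked here.

```python
import os


def usable_cpus():
    """Number of CPUs this process is allowed to run on."""
    # sched_getaffinity is only available on some platforms (e.g. Linux);
    # fall back to the total CPU count elsewhere.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def tippecanoe_env(threads):
    """Copy of the current environment with the thread limit set."""
    env = dict(os.environ)
    env["TIPPECANOE_MAX_THREADS"] = str(threads)
    return env


if __name__ == "__main__":
    n = usable_cpus()
    print(f"CPUs usable by this process: {n}")
    env = tippecanoe_env(n)
    print("TIPPECANOE_MAX_THREADS =", env["TIPPECANOE_MAX_THREADS"])
    # Pass env=env to subprocess.run([...]) when launching tippecanoe.
```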

e-n-f commented 4 months ago

Tippecanoe will generally use as many CPUs as are available, even if TIPPECANOE_MAX_THREADS is not set, but there are a few parts of tippecanoe that are inherently single-threaded: feature reordering after ingestion and before tiling is limited by I/O speed, and most of processing the z0 tile is a single thread since there is only one tile in the zoom level.
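To illustrate why single-threaded phases dominate on a 16-core machine, here is a rough back-of-the-envelope sketch. It is a simplified model, not a description of Tippecanoe's scheduler: it assumes a standard web-map tile pyramid (4^z tiles at zoom z) and uses Amdahl's law with a made-up parallel fraction.

```python
def tiles_at_zoom(z):
    """Tiles in a full web-map pyramid at zoom z (2^z by 2^z grid)."""
    return 4 ** z


def amdahl_speedup(parallel_fraction, threads):
    """Upper bound on speedup when only part of the work can be parallelized."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    # z0 has a single tile, so per-tile parallelism has nothing to split there.
    for z in range(4):
        print(f"z{z}: {tiles_at_zoom(z)} tile(s)")
    # Hypothetical split: half the wall-clock time is in single-threaded phases.
    print(f"Speedup bound on 16 threads: {amdahl_speedup(0.5, 16):.2f}x")
```

Under this model, adding cores helps very little once the serial phases (reordering, z0) account for a large share of the run time.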

Are there any log messages visible at the point where it is stuck? "Reordering geometry?" "Merging vertices?"

Can you share a copy of the GeoJSON file so I can try to reproduce the problem?

felt / tippecanoe

Low CPU usage for Tippecanoe on EC2 #218