protomaps / go-pmtiles

Single-file executable tool for working with PMTiles archives
BSD 3-Clause "New" or "Revised" License

Extract region forcibly closed by remote host #102

Closed by maphew 9 months ago

maphew commented 10 months ago

When downloading a region from the latest protomaps build I got the errors below. The third attempt finally succeeded.

CMD T:\ENV.608\output
» pmtiles extract https://build.protomaps.com/20231113.pmtiles yt_protomaps-20231113.pmtiles --region=yt_aoi_1000k_wgs84.geojson
fetching 118 dirs, 118 chunks, 16 requests
Region tiles 19366132, result tile entries 4146674
fetching 4146674 tiles, 3159 chunks, 738 requests
fetching chunks  43% |██████████████████████████                                    | (726 MB/1.6 GB, 791 kB/s) [3m7s:20m25s]
2023/11/14 12:36:13 main.go:129: Failed to extract, read tcp 10.128.64.215:58881->104.26.2.120:443: wsarecv: An existing connection was forcibly closed by the remote host.

CMD T:\ENV.608\output
» z:\tools\bin\pmtiles extract   https://build.protomaps.com/20231113.pmtiles   --region=yt_aoi_1000k_wgs84.geojson   yt_protomaps-20231113.pmtiles
fetching 118 dirs, 118 chunks, 16 requests
Region tiles 19366132, result tile entries 4146674
fetching 4146674 tiles, 3159 chunks, 738 requests
fetching chunks  92% |███████████████████████████████████████████████████████████      | (1.5/1.6 GB, 207 kB/s) [2m22s:9m55s]
2023/11/14 12:41:10 main.go:129: Failed to extract, stream error: stream ID 41; INTERNAL_ERROR; received from peer

CMD T:\ENV.608\output
» z:\tools\bin\pmtiles extract   https://build.protomaps.com/20231113.pmtiles   --region=yt_aoi_1000k_wgs84.geojson   yt_protomaps-20231113.pmtiles
fetching 118 dirs, 118 chunks, 16 requests
Region tiles 19366132, result tile entries 4146674
fetching 4146674 tiles, 3159 chunks, 738 requests
fetching chunks 100% |██████████████████████████████████████████████████████████████████████| (1.6/1.6 GB, 14 MB/s)
Completed in 2m14.9249596s with 4 download threads (30733.186893613125 tiles/s).
Extract required 757 total requests.
Extract transferred 1.8 GB (overfetch 0.05) for an archive size of 1.7 GB

pmtiles 1.11.1, commit 376ea25828660a3736af545d4fc4966cd19f21d5, built at 2023-11-12T08:45:19Z
Windows 10 Enterprise.

yt_aoi_1000k_wgs84.json

bdon commented 10 months ago

Can you try setting --download-threads=1 and see if it completes with no issues, or any lower number than 4?

maphew commented 10 months ago

1 thread worked, though the last percent was painful to watch, with transfer speed dropping to 300-350 B/s for the last minute or so.

» z:\tools\bin\pmtiles extract   https://build.protomaps.com/20231113.pmtiles   --region=yt_aoi_1000k_wgs84.geojson   yt_protomaps-20231113.pmtiles   --download-threads=1
fetching 118 dirs, 118 chunks, 16 requests
Region tiles 19366132, result tile entries 4146674
fetching 4146674 tiles, 3159 chunks, 738 requests
fetching chunks 100% |████████████████████████████████████████████████████████████████████| (1.6/1.6 GB, 4.6 MB/s)
Completed in 6m16.4022314s with 1 download threads (11016.603128458499 tiles/s).
Extract required 757 total requests.
Extract transferred 1.8 GB (overfetch 0.05) for an archive size of 1.7 GB

5 threads started at about 20 MB/s, peaked at 25 MB/s, dropped to 10 MB/s at 97%, and steadily decreased from there. It succeeded on the first try.

Completed in 2m23.2003127s with 5 download threads (28957.157437827296 tiles/s).

Today, running without the threads parameter showed a similar pattern to 5 threads: speed steadily decreased after 97%, and it completed on the first attempt. It took only 7 seconds longer than 5 threads (2m30s).

10 threads peaked at 32 MB/s, averaging 20, finishing successfully in 1m55s.

I don't know if any of this is useful, but I'm happy to do more tests.

bdon commented 10 months ago

Thanks for the detailed report. Are you able to reproduce the original problem? Maybe it was an intermittent issue related to the network or Cloudflare storage (I didn't see CF report any downtime, though).

Yes, the extract gets slower at the end because it fetches chunks from largest to smallest. For small chunks, the time spent waiting on each request is large relative to the bytes downloaded. If you have 2 or more threads, the 2nd thread works from the end of the queue to speed this up.
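
For readers who want to picture the scheduling described here, this is a minimal sketch in Go. It is not the actual go-pmtiles code: the `chunk` type, `chunkQueue`, `newChunkQueue`, and `fetchAll` are hypothetical names made up for illustration, and the `fetch` callback stands in for whatever HTTP range request a real extract would issue. It only shows the idea of sorting chunks largest first and letting the second worker drain the small chunks from the back of the queue.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// chunk is a hypothetical stand-in for one contiguous byte range to download.
type chunk struct {
	Offset uint64
	Length uint64
}

// chunkQueue is a mutex-guarded double-ended queue, sorted largest chunk first.
type chunkQueue struct {
	mu     sync.Mutex
	chunks []chunk
}

// newChunkQueue copies the input and sorts it by descending length.
func newChunkQueue(chunks []chunk) *chunkQueue {
	sorted := append([]chunk(nil), chunks...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Length > sorted[j].Length
	})
	return &chunkQueue{chunks: sorted}
}

// next removes one chunk from the front (largest) or the back (smallest).
// It reports false once the queue is empty.
func (q *chunkQueue) next(fromBack bool) (chunk, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.chunks) == 0 {
		return chunk{}, false
	}
	var c chunk
	if fromBack {
		c = q.chunks[len(q.chunks)-1]
		q.chunks = q.chunks[:len(q.chunks)-1]
	} else {
		c = q.chunks[0]
		q.chunks = q.chunks[1:]
	}
	return c, true
}

// fetchAll runs the given number of workers over the queue. The second worker
// (index 1) takes chunks from the back; all others take from the front.
// It returns the chunks each worker handled, in the order it handled them.
func fetchAll(chunks []chunk, threads int, fetch func(chunk)) [][]chunk {
	q := newChunkQueue(chunks)
	handled := make([][]chunk, threads)
	var wg sync.WaitGroup
	for w := 0; w < threads; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for {
				c, ok := q.next(w == 1)
				if !ok {
					return
				}
				fetch(c)
				handled[w] = append(handled[w], c) // each worker writes only its own slot
			}
		}(w)
	}
	wg.Wait()
	return handled
}

func main() {
	chunks := []chunk{{0, 10}, {10, 5000}, {5010, 300}, {5310, 2}, {5312, 90000}}
	// A real fetch would issue an HTTP range request here; this one does nothing.
	handled := fetchAll(chunks, 2, func(c chunk) {})
	for w, cs := range handled {
		fmt.Printf("worker %d handled %d chunks\n", w, len(cs))
	}
}
```

With a single worker, the many small chunks are all left until the end, which would match the slow final percent seen with `--download-threads=1`. How the chunks are split between workers in this sketch depends on goroutine scheduling, so the per-worker counts vary from run to run.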

maphew commented 9 months ago

Not reproduced, though I only tried a couple more times. Closing until it becomes relevant again.