drolbr / Overpass-API

A database engine to query the OpenStreetMap data.
http://overpass-api.de
GNU Affero General Public License v3.0

Improve resilience if diff fetch fails or .osc.gz file is corrupted #731

Open b1tw153 opened 4 weeks ago

b1tw153 commented 4 weeks ago

If the download of a .osc.gz file is interrupted, an incomplete file can be left in the diff directory. When apply_osc_to_db.sh then tries to apply this file, it hangs and never completes the update. At that point, the script prints the following error:

gzip: stdin: unexpected end of file
Reading XML file ...Parse error at line NNNN:
unclosed token

To recover from this scenario, the broken .osc.gz file must be re-downloaded manually; the fetch_osc.sh script will not replace it automatically. After that, all of the components must be restarted, including the dispatcher processes. If apply_osc_to_db.sh is restarted without also restarting the dispatcher, it still hangs and fails to complete the update.
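
As a rough aid for the manual recovery step, the broken files can be located with `gzip -t`, which tests the integrity of a gzip stream without extracting it. The sketch below is only an illustration: the function name `find_corrupt_diffs` is hypothetical, and it assumes the diffs are stored as `*.osc.gz` files somewhere under a single directory.

```shell
#!/bin/sh
# Hypothetical helper: print the path of every .osc.gz file under
# directory $1 that fails gzip's integrity test (for example because
# the download was cut off partway through).
find_corrupt_diffs() {
    find "$1" -type f -name '*.osc.gz' | while IFS= read -r f; do
        # gzip -t exits non-zero on truncated or corrupted input.
        gzip -t "$f" 2>/dev/null || printf '%s\n' "$f"
    done
}
```

Calling `find_corrupt_diffs /path/to/diffs` (path is a placeholder) lists the candidates for re-download. It does not delete anything and does not restart any component.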

Two fixes would help. The fetch_osc.sh script could be better protected against interrupted downloads, and the apply_osc_to_db.sh script could handle corrupted .osc.gz files more gracefully.
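
One possible shape for the first fix is to download to a temporary name and only move the file into place once it passes an integrity check. This is a minimal sketch and not how fetch_osc.sh is currently written; the function name `install_if_complete` and the `.part` suffix used in the note below are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: move a finished download into place only if it
# is a complete gzip stream.
#   $1: temporary download path
#   $2: final path in the diff directory
install_if_complete() {
    tmp=$1
    target=$2
    if gzip -t "$tmp" 2>/dev/null; then
        # A rename within one filesystem is atomic, so readers never
        # see a half-written file under the final name.
        mv -f "$tmp" "$target"
    else
        # Discard the partial file so a later fetch can try again.
        rm -f "$tmp"
        return 1
    fi
}
```

A fetch step could then look like `wget -q -O "$target.part" "$url" && install_if_complete "$target.part" "$target"`, where `$url` and `$target` are placeholders. With this approach an interrupted download never appears under the name that apply_osc_to_db.sh reads.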

b1tw153 commented 4 weeks ago

I think this issue should be reproducible by manually truncating a .osc.gz file before it is processed.
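
A truncated file for such a test could be produced roughly as follows. The function name `truncate_copy` is hypothetical, and cutting the file at half its size is an arbitrary choice; working on a copy keeps the original diff intact.

```shell
#!/bin/sh
# Hypothetical helper: write the first half of file $1 to $2,
# simulating a download that was interrupted partway through.
truncate_copy() {
    size=$(wc -c < "$1")
    head -c "$((size / 2))" "$1" > "$2"
}
```

Running `gzip -t` on the resulting copy should fail, which confirms that it reproduces the kind of incomplete file described above before it is placed in a test instance's diff directory.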