drolbr / Overpass-API

A database engine to query the OpenStreetMap data.
http://overpass-api.de
GNU Affero General Public License v3.0

download_clone: limit number of retries #643

Closed mmd-osm closed 2 years ago

mmd-osm commented 2 years ago

Some users try to download a clone db over a rather slow network. In some cases, the files on the server will have been deleted long before download_clone.sh finishes the download. This results in scripts running for weeks or months without making any progress, except for filling up log files.

download_clone.sh should have a maximum number of retries and then simply stop any further download attempts.

drolbr commented 2 years ago

Have you observed that in the wild? There usually are quite a number of other processes with their own timeouts weighing in, so I expect this to be close to impossible in practice.

mmd-osm commented 2 years ago

Yes, the dev server log files include a download attempt for some 2021-08-18 clone file every 15s. The problem isn't really the wget timeout or any download timeout. It's simply that people start the clone and may have been able to download, say, 15 out of 30 files, which already took way too much time. If we then delete old clone files on the server because of limited space, the client will be stuck forever in this retry loop:

I have no idea why people aren't monitoring their clone script and stopping it at some point... but that's another topic.

# Retries indefinitely: re-fetch the file every 15 seconds
# until a non-empty local copy exists.
retry_fetch_file()
{
  fetch_file "$1" "$2"
  until [[ -s "$2" ]]; do {
    sleep 15
    fetch_file "$1" "$2"
  }; done
};
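
For illustration, a minimal sketch of what a capped retry could look like, reusing the fetch_file helper from the snippet above. The max_retries value, the attempt counter, and the error message are assumptions for the sketch, not the actual fix (see the commit referenced below):

# Sketch only: give up after a fixed number of attempts instead of looping forever.
retry_fetch_file()
{
  local max_retries=100   # assumed limit, not taken from the actual fix
  local attempt=0
  fetch_file "$1" "$2"
  until [[ -s "$2" ]]; do {
    attempt=$((attempt + 1))
    if [[ $attempt -ge $max_retries ]]; then {
      echo "Giving up on $1 after $max_retries failed attempts" >&2
      exit 1
    }; fi
    sleep 15
    fetch_file "$1" "$2"
  }; done
};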
drolbr commented 2 years ago

Thank you for the details. This makes sense.

drolbr commented 2 years ago

Fixed in ff3718d0438479c301103eebe4a9e1892396ed15