ESPRI-Mod / synda

ESGF Downloader (this is a deprecated repository, the tool has now moved to https://github.com/ESGF/esgf-download)
https://espri-mod.github.io/synda/

More fallbacks after a url fails, including different data nodes #90

Closed (painter1 closed 5 years ago)

painter1 commented 6 years ago

These changes are not ready to be merged, but they work for me.

When Synda cannot get data from the supplied data node with the requested protocol, it can now try other data nodes and protocols. Previously, the only supported "fallback" was a switch from gsiftp to http at the same data node.
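As a rough illustration of the fallback idea (not the code in the more_fallbacks branch), the sketch below orders candidate urls by protocol and then by data node, and tries each one until a download succeeds. The names `PROTOCOL_ORDER`, `order_candidates`, `download_with_fallbacks`, the `fetch` callback, and the example host names are all hypothetical, and sorting by protocol before data node is an assumption.

```python
from urllib.parse import urlsplit

# Assumed preference order: gsiftp first, then http.
PROTOCOL_ORDER = ("gsiftp", "http")


def order_candidates(urls, node_priority):
    """Sort urls by protocol preference, then by data node priority.

    Unknown protocols and unknown data nodes sort last.
    """
    def key(url):
        parts = urlsplit(url)
        if parts.scheme in PROTOCOL_ORDER:
            proto_rank = PROTOCOL_ORDER.index(parts.scheme)
        else:
            proto_rank = len(PROTOCOL_ORDER)
        if parts.hostname in node_priority:
            node_rank = node_priority.index(parts.hostname)
        else:
            node_rank = len(node_priority)
        return (proto_rank, node_rank)

    return sorted(urls, key=key)


def download_with_fallbacks(urls, fetch, node_priority, failed=None):
    """Try each candidate url in order and return the first that works.

    `fetch` is a hypothetical callable returning True on success.
    Urls that fail are added to `failed` and skipped on later calls.
    Returns None if every candidate fails.
    """
    if failed is None:
        failed = set()
    for url in order_candidates(urls, node_priority):
        if url in failed:
            continue
        if fetch(url):
            return url
        failed.add(url)
    return None


if __name__ == "__main__":
    candidates = [
        "http://a.example.org/f.nc",
        "gsiftp://b.example.org/f.nc",
        "gsiftp://a.example.org/f.nc",
    ]
    # Pretend that only http transfers succeed.
    result = download_with_fallbacks(
        candidates,
        fetch=lambda u: u.startswith("http://"),
        node_priority=["a.example.org", "b.example.org"],
    )
    print(result)
```

Here the two gsiftp urls are tried first and fail, so the http url at the first data node is the one returned.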

The more_fallbacks branch fits my needs, but in several respects it is not ready for a general release:

  1. It always prefers gsiftp over http.
  2. The maximum number of data nodes and their relative priorities are hard-coded.
  3. The list of failed urls is saved in the database, though there may well be a better place for it. This requires a new table in the database; see below. The table should be cleared periodically, and it isn't yet: once a file is downloaded, the urls for that file are useless, and any failed url is worth a future retry because the server failure may be temporary.

         CREATE TABLE failed_url ( url_id INTEGER PRIMARY KEY, url TEXT, file_id INTEGER );
         CREATE UNIQUE INDEX idx_failed_url_1 ON failed_url (url);

  4. My Emacs couldn't find the specified character set, so I removed the character set specification from each file. Obviously this is not a good long-term solution.
  5. There is more logging than we would want in a completed module.
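The failed_url table from item 3, and the cleanup that item says is still missing, could be exercised roughly as follows. This is a minimal sketch using Python's standard `sqlite3` module against an in-memory database; the helper names (`init_db`, `record_failure`, `has_failed`, `clear_for_file`) are hypothetical and not part of Synda, and the `IF NOT EXISTS` clauses are an addition so that the schema can be applied more than once.

```python
import sqlite3

# Schema from item 3; IF NOT EXISTS is added here for repeatable setup.
SCHEMA = """
CREATE TABLE IF NOT EXISTS failed_url (
    url_id INTEGER PRIMARY KEY,
    url TEXT,
    file_id INTEGER
);
CREATE UNIQUE INDEX IF NOT EXISTS idx_failed_url_1 ON failed_url (url);
"""


def init_db(conn):
    """Create the failed_url table and its unique index if missing."""
    conn.executescript(SCHEMA)


def record_failure(conn, url, file_id):
    """Remember a failed url; the unique index makes repeats a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO failed_url (url, file_id) VALUES (?, ?)",
        (url, file_id),
    )
    conn.commit()


def has_failed(conn, url):
    """Return True if the url is currently listed as failed."""
    row = conn.execute(
        "SELECT 1 FROM failed_url WHERE url = ?", (url,)
    ).fetchone()
    return row is not None


def clear_for_file(conn, file_id):
    """Drop the failed urls of a file, e.g. once it has been downloaded.

    Returns the number of rows removed.
    """
    cur = conn.execute("DELETE FROM failed_url WHERE file_id = ?", (file_id,))
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    record_failure(conn, "gsiftp://a.example.org/f.nc", 1)
    print(has_failed(conn, "gsiftp://a.example.org/f.nc"))
    print(clear_for_file(conn, 1))
```

Calling something like `clear_for_file` after each successful download would handle the "urls for that file are useless" case; a periodic purge of older rows, so that failed urls get retried, would need an extra column such as a timestamp, which the table above does not have.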