Open scottyhq opened 4 months ago
Thanks for the report! We attempt to fetch S1 orbit files from the Copernicus Dataspace Ecosystem and the ASF archive at s1qc.asf.alaska.edu. Unfortunately, both of those systems are outage-prone, so we see intermittent OrbitDownloadErrors when fetches from both fail.
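For readers following along, the fallback behaviour described above can be sketched roughly as follows. This is only an illustration, not the hyp3-isce2 implementation: the `OrbitDownloadError` class here is a local stand-in, and `fetch_with_fallback` and the `(name, fetch)` source pairs are hypothetical names chosen for the example.

```python
class OrbitDownloadError(Exception):
    """Local stand-in for the error raised when no source returns the file."""


def fetch_with_fallback(filename, sources):
    """Try each (name, fetch) pair in order and return the first success.

    `sources` is a list of (label, callable) pairs; each callable takes the
    orbit file name and either returns its contents or raises.
    """
    failures = []
    for name, fetch in sources:
        try:
            return fetch(filename)
        except Exception as exc:  # a real client would catch narrower network errors
            failures.append(f"{name}: {exc}")
    # Only reached when every source failed; report each failure so the
    # message does not suggest that the file is missing.
    raise OrbitDownloadError(
        f"Could not download {filename} from any source ({'; '.join(failures)})"
    )
```

Collecting the per-source failures into the final message is a design choice made for this sketch; it makes it easier to tell an outage on both systems apart from a genuinely missing file.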
ASF has been monitoring the performance of s1qc.asf.alaska.edu for several months. We've observed that generating the index pages for resorb/poeorb files is particularly expensive. This can degrade server performance when many indexing requests come in at the same time (often driven by parallel processing systems like HyP3), resulting in longer response times and increased error rates for all requests to the server (both index requests and .EOF download requests).
@bbuechler et al. implemented caching for the two index pages yesterday. We're hopeful this will eliminate these performance bottlenecks and greatly improve the reliability of orbit fetching from s1qc.asf.alaska.edu. Please let us know if you continue to see these errors at a higher-than-acceptable rate going forward.
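To illustrate why caching helps here, below is a minimal sketch of a time-to-live cache wrapped around an expensive page build. It is not how ASF implemented the change on s1qc.asf.alaska.edu (those details are not given in this thread); `TTLCache`, `build`, and `ttl_seconds` are hypothetical names for the example.

```python
import time


class TTLCache:
    """Cache one expensive result and rebuild it only after it expires."""

    def __init__(self, build, ttl_seconds, clock=time.monotonic):
        self._build = build          # expensive callable, e.g. rendering an index page
        self._ttl = ttl_seconds      # how long a cached result stays valid
        self._clock = clock          # injectable so the behaviour can be tested
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache is empty or stale: pay the cost once and remember the result.
            self._value = self._build()
            self._expires_at = now + self._ttl
        return self._value
```

With something like this in front of the index pages, a burst of simultaneous index requests would trigger at most one rebuild per TTL window instead of one per request. This sketch is single-threaded; a real server would also need to guard the rebuild against concurrent callers.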
FYI for @cmarshak, ASF made a change to s1qc.asf.alaska.edu that we expect will reduce the rate of OrbitDownloadErrors for https://github.com/ACCESS-Cloud-Based-InSAR/DockerizedTopsApp and https://github.com/dbekaert/RAiDER as well.
@mgovorcin has noticed that our jobs are now succeeding at a rate above 85% and was able to complete the processing of the CONUS West Coast.
The bug
Occasionally it seems some sort of network error leads to this traceback, which is a bit misleading. The orbit files do exist, the download just failed for some reason.
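One possible way to make the failure less misleading is to retry transient network errors and keep "file does not exist" separate from "download failed". The sketch below is an assumption about how that could look, not what hyp3-isce2 does: `download_with_retry`, `OrbitNotFoundError`, the local `OrbitDownloadError`, and the `fetch` callable are all hypothetical names, and `fetch` is assumed to raise `FileNotFoundError` for a missing file and `ConnectionError` or `TimeoutError` for network problems.

```python
import time


class OrbitNotFoundError(Exception):
    """The orbit file does not exist at the source (hypothetical)."""


class OrbitDownloadError(Exception):
    """The download failed for a transient reason (local stand-in)."""


def download_with_retry(fetch, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying transient errors with exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except FileNotFoundError as exc:
            # A missing file will not appear on retry, so fail immediately.
            raise OrbitNotFoundError(f"{url} does not exist") from exc
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Wait base_delay, then 2x, 4x, ... between attempts.
                sleep(base_delay * 2 ** attempt)
    raise OrbitDownloadError(
        f"Download of {url} failed after {attempts} attempts: {last_error}"
    ) from last_error
```

Keeping the two exception types distinct means a traceback caused by an outage says so directly, rather than implying that no orbit file was found.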
To Reproduce
It's intermittent, so hard to reproduce. (hyp3-isce2 v1.0.1)
Additional context