cvmfs / cvmfs

The CernVM File System
http://cernvm.cern.ch/portal/filesystem
BSD 3-Clause "New" or "Revised" License

publisher panics if catalog download fails for any reason #2914

Open mharvey-jt opened 2 years ago

mharvey-jt commented 2 years ago

We occasionally observe publisher failures when transient errors from S3 cause LoadCatalog() to fail. The panic comes from: https://github.com/cvmfs/cvmfs/blob/devel/cvmfs/catalog_mgr_ro.cc#L45-L49

desired behaviour:

This occurs so rarely that we have never captured the exact HTTP error behind the failure, so our expedient fix has been simply to retry LoadCatalog() in a loop at 1-second intervals. I'll not PR that.
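
For illustration only, the kind of fixed-interval retry loop described above might look roughly like the following. This is a minimal sketch, not the actual patch: `RetryFixedInterval` is a hypothetical helper that does not exist in cvmfs, and the callable stands in for whatever wraps the LoadCatalog() call and reports success as `true`.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper, not part of cvmfs: calls `attempt` until it returns
// true or `max_attempts` calls have been made, sleeping `interval` between
// failed attempts. Returns true if any attempt succeeded.
inline bool RetryFixedInterval(const std::function<bool()> &attempt,
                               unsigned max_attempts,
                               std::chrono::milliseconds interval) {
  assert(max_attempts > 0);
  for (unsigned i = 0; i < max_attempts; ++i) {
    if (attempt()) return true;
    // Do not sleep after the final failed attempt.
    if (i + 1 < max_attempts) std::this_thread::sleep_for(interval);
  }
  return false;
}
```

A caller would pass a lambda wrapping the catalog load and an interval of `std::chrono::seconds(1)`, and only panic once the helper returns `false`. Bounding the number of attempts keeps a persistent outage from hanging the publisher indefinitely.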

jblomer commented 1 year ago

We should use the download manager's own retry logic and check if the defaults are sensible and how they can be changed.

HereThereBeDragons commented 1 year ago

Probably good to integrate it with #3095

jblomer commented 1 year ago

I think we can do this independently from #3095. It's just about using the download manager's retry parameters sensibly.
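
To make "retry parameters" concrete, here is a generic sketch of a capped exponential backoff schedule driven by three values: a retry count, an initial delay, and a maximum delay. The function name `BackoffDelaysMs` and its parameter names are assumptions chosen for illustration; this is not the download manager's actual algorithm, which may differ (for example by randomizing delays).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical illustration, not cvmfs code: returns the delay (in ms) to
// wait before each of `max_retries` retries, starting at `init_ms` and
// doubling each time, never exceeding `max_ms`.
inline std::vector<unsigned> BackoffDelaysMs(unsigned max_retries,
                                             unsigned init_ms,
                                             unsigned max_ms) {
  std::vector<unsigned> delays;
  unsigned delay = std::min(init_ms, max_ms);
  for (unsigned i = 0; i < max_retries; ++i) {
    delays.push_back(delay);
    // Double without overflowing, then clamp to the cap.
    delay = (delay > max_ms / 2) ? max_ms : delay * 2;
  }
  return delays;
}
```

Summing the schedule gives the worst-case extra time a publish operation would wait before giving up, which is one way to judge whether a given set of defaults is sensible for transient S3 errors.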