HERA-Team / hera_corr_f

HERA F-Engine on SNAP

Overly Broad Exception in the snap monitor #95

Open mkolopanis opened 2 years ago

mkolopanis commented 2 years ago

Software call 03/28/2022:

The hera-snap-redis-monitor has again been getting into a state where the script keeps running but does no useful work. The Fengs are missing attributes such as pam, or fail on every read, yet the monitor keeps retrying. A restart usually fixes these issues, but the daemon as written never exits or raises an error. The daemon can be configured to restart on an error, but at present it continues to "run" while doing nothing.

We are concerned about how broad the linked exception handler is. We definitely want to log communication issues (e.g. TFTP errors) and continue, because these might be recoverable; Python errors, however, should not be ignored.

For instance, if a SnapFengine object doesn't have a particular attribute like the pam block, we should raise an error. A missing attribute is a symptom of an uninitialized or poorly initialized object.

It might be better to have either a list of approved errors to accept or a list of errors that always propagate. The consensus is to have a list of approved errors that do not inhibit further functionality (e.g. print and keep going on TFTP errors, but raise on anything else).
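A minimal sketch of the allowlist approach, assuming Python 3. `HypotheticalTftpError`, `RECOVERABLE_ERRORS`, `poll_once`, and the `read_status` callable are all made-up names for illustration, not part of hera_corr_f or its transport library. The real exception types raised on TFTP failures would need to be confirmed before filling in the tuple.

```python
import logging

logger = logging.getLogger("snap_monitor_sketch")


class HypotheticalTftpError(Exception):
    """Stand-in for whatever the transport layer raises on a TFTP failure."""


# Assumption: only these error types are considered recoverable.
# Anything not listed here (AttributeError, TypeError, ...) propagates.
RECOVERABLE_ERRORS = (HypotheticalTftpError, ConnectionError, TimeoutError)


def poll_once(fengs, read_status):
    """Read status from each F-engine and return {host: status} for successes.

    `fengs` maps host names to F-engine objects; `read_status` is any
    callable that takes one F-engine and returns its status.
    """
    results = {}
    for host, feng in fengs.items():
        try:
            results[host] = read_status(feng)
        except RECOVERABLE_ERRORS as err:
            # Communication problem: log it and move on to the next board.
            logger.warning("Recoverable error on %s: %r", host, err)
        # No bare `except Exception`: a missing attribute such as `pam`
        # raises AttributeError, which is deliberately not caught here.
    return results
```

With this shape, a poorly initialized object makes the monitor exit with a traceback instead of looping silently, so a supervisor configured to restart on error can bring it back up.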

https://github.com/HERA-Team/hera_corr_f/blob/c9310d48a353cfd42156fe7b56d8b85c9ff2f8e3/control_software/scripts/hera_snap_redis_monitor.py#L139-L142