We had an episode in which a broadcast reported no data health statuses for about 50 minutes before it was completed/revoked. The logs show no sign that the unhealthy state was entered or that we tried to remedy the problem with hardware restarts. We need to check that this transition works correctly.
This turned out to be caused by the Issue tally stored in the config not being saved on update, i.e. it was back at 0 every time the SM was spun up.
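To illustrate the failure mode, here is a minimal sketch rather than the actual code. It assumes the SM increments the tally once per spin-up when it sees a bad health status, and transitions to unhealthy once the tally reaches a threshold. The names `load_tally`, `save_tally`, `run_once`, `simulate`, the `issue_tally` key, the JSON config file and the threshold of 3 are all hypothetical.

```python
import json
import os
import tempfile

UNHEALTHY_THRESHOLD = 3  # hypothetical value, not taken from the real config


def load_tally(path):
    """Return the persisted issue tally, or 0 if nothing has been saved yet."""
    try:
        with open(path) as f:
            return int(json.load(f).get("issue_tally", 0))
    except FileNotFoundError:
        return 0


def save_tally(path, tally):
    """Write the updated tally back to the config file."""
    with open(path, "w") as f:
        json.dump({"issue_tally": tally}, f)


def run_once(path, persist):
    """Simulate one spin-up of the SM that observes a single bad health status."""
    tally = load_tally(path) + 1
    if persist:
        # The bug described above corresponds to skipping this save.
        save_tally(path, tally)
    return "unhealthy" if tally >= UNHEALTHY_THRESHOLD else "healthy"


def simulate(spin_ups, persist):
    """Run several spin-ups against one config file and collect the states."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "config.json")
        return [run_once(path, persist) for _ in range(spin_ups)]
```

Under these assumptions, `simulate(4, persist=False)` never leaves the healthy state because every spin-up starts from 0, while `simulate(4, persist=True)` reaches unhealthy on the third spin-up.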