Improve Ra server resilience when log infrastructure encounters faults

Various improvements to data safety when log infrastructure processes encounter faults.

In particular there are many improvements and fixes relating to the server -> wal resend protocol including:

Bug fix to ra_log_cache that would cause most triggered resends result in a ra process crash.
Dropping fewer messages using the gen_state postpone feature.
Ra leaders would previously just exit with wal_down - now they enter the same await_condition state although with a shorter timeout after which the begin a leader transfer process
Improved detection and availability when a command is lost on the way to the wal and no further commands are sent.

Also there is a new feature to configure on a per system basis what kind of server recovery should take place when a ra system starts/restarts. There are 3 options:

undefined : do not restart any ra server
registered: restart all locally registered servers for the system
mfa: call a custom function that performs the restart.

This feature will allow dynamically started ra server to be restarted should the ra system crash and restart.

Also improvements to code coverage and refactoring.

Fixes: https://github.com/rabbitmq/ra/issues/416

rabbitmq / ra

Improve Ra server resilience when log infrastructure encounters faults #428