Error fallback on router for faulty connections

The router continues to send requests to replicas that are known to be broken. These are orphan nodes that have not finished recovery/bootstrap yet, or have finished it with an error and are now broken. This also includes instances that have not called vshard.storage.cfg, or have called it but it has not finished yet.

If the boot is not finished, all kinds of bad behaviour are possible. The worst ones:

Some of the vshard.storage functions are recovered in _func and some are not, so the storage is half-usable;
Some user functions are not recovered yet, so nothing fails inside vshard itself, but the user's code fails.

It seems reasonable to treat box.info.status ~= 'running' as a sign that the node is not ready to do anything. This check can be done right in the storage functions. Once they see that the instance is running, the storage can reload itself to a version without these checks, so as not to call the expensive box.info when it is no longer necessary.
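A minimal sketch of the self-disabling check described above, written in plain Lua so that it runs outside Tarantool. The names `make_guarded` and `get_status` and the error table shape are hypothetical and not part of vshard. On a real instance `get_status` would read `box.info.status`. Swapping a function reference is only a stand-in for the "reload to a version without checks" idea.

```lua
-- Hypothetical helper: wraps a storage function with a readiness check.
-- get_status stands in for reading box.info.status.
local function make_guarded(get_status, fn)
    local impl
    local function guarded(...)
        local status = get_status()
        if status ~= 'running' then
            -- Assumed error shape, not vshard's actual error object.
            return nil, {name = 'INSTANCE_NOT_RUNNING', status = status}
        end
        -- The instance is ready: drop the check for all later calls.
        impl = fn
        return fn(...)
    end
    impl = guarded
    -- The returned function always dispatches to the current version.
    return function(...)
        return impl(...)
    end
end
```

After the first call that observes 'running', the status is not read again, which matches the goal of avoiding the expensive box.info lookup once it is no longer needed.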

If the storage functions are not available yet, netbox will return something nasty like:

error: Execute access to function 'test' is denied for user 'guest';
error: Procedure 'test' is not defined.
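One possible way for the router to recognise these two errors is sketched below. The function name `is_storage_not_ready` is hypothetical, and matching on message text is an assumption made to keep the example self-contained. Inside Tarantool, comparing the error code against `box.error.ACCESS_DENIED` and `box.error.NO_SUCH_PROC` would likely be more robust.

```lua
-- Hypothetical classifier: true when the error says that a
-- vshard.storage function is not callable on the replica yet.
-- Accepts either a plain string or a table with a 'message' field.
local function is_storage_not_ready(err)
    local msg = type(err) == 'table' and err.message or tostring(err)
    -- Extract the function name from either of the two messages.
    local name = msg:match("^Procedure '(.-)' is not defined")
        or msg:match("^Execute access to function '(.-)' is denied for user")
    -- Only failures of vshard.storage functions are treated as faulty.
    return name ~= nil and name:find('vshard.storage.', 1, true) == 1
end
```

Errors about user functions, such as 'test' in the messages above, are deliberately not matched here, since the text only asks for special handling of vshard.storage functions.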

If the router encounters these errors for any of the vshard.storage functions, or a vshard.storage function explicitly returns an error saying that the instance is not 'running', it must put such a connection into a backoff state for some time before retrying. At the same time, the retry on another instance must be automatic when any of these errors is seen, regardless of the request mode, read or write. These are not network errors, so the requests can be freely retried.
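The backoff and automatic retry could look roughly like the sketch below. Everything in it is hypothetical: `call_with_fallback`, the replica table layout, the injected `is_faulty` and `now` functions, and the 5 second interval are illustrative assumptions, not vshard's router API.

```lua
-- Arbitrary illustrative value, in seconds.
local BACKOFF_INTERVAL = 5

-- replicas:  array of {name = ..., call = function, backoff_until = number or nil}
-- is_faulty: function(err) telling whether the instance is not ready
-- now:       function returning the current time in seconds
local function call_with_fallback(replicas, is_faulty, now, ...)
    local last_err = {name = 'NO_AVAILABLE_REPLICA'}
    for _, replica in ipairs(replicas) do
        -- Skip connections that are still in backoff.
        if (replica.backoff_until or 0) <= now() then
            local res, err = replica.call(...)
            if err == nil then
                return res
            end
            if not is_faulty(err) then
                -- Any other error is returned to the caller as is.
                return nil, err
            end
            -- Faulty instance: back off and try the next replica.
            replica.backoff_until = now() + BACKOFF_INTERVAL
            last_err = err
        end
    end
    return nil, last_err
end
```

The same loop is used for reads and writes, following the point that these errors can be retried regardless of the request mode.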
