aerospike / aerospike-client-java

Aerospike Java Client Library

Transient errors #119

Closed Aloren closed 5 years ago

Aloren commented 5 years ago

It would be nice to have the ability to know which of the errors we get from the Aerospike client are retryable/transient. Maybe it makes sense to add an additional interface, RetryableAerospikeException? Or at least it would be very helpful to have some documentation on which error codes are retryable and which are not. Please take a look at this page.

BrianNichols commented 5 years ago

The only retryable errors are:

1) Client socket timeouts. Total timeouts are not retried.
2) Client connection errors.
3) Client socket read/write errors.
4) Server timeouts. These are retryable because the server can occasionally (very rarely) time out before the client does, so a retry can theoretically occur before the client timeout.

Aloren commented 5 years ago

And what about these error codes? These are not timeouts, but to me they look like a temporary issue on the server:

BrianNichols commented 5 years ago

The retryable errors I mentioned are the errors that the client will automatically retry (assuming maxRetries is not exceeded). The user can perform an external retry on any error they want.

The client does not retry on INVALID_NODE_ERROR, SERVER_ERROR, CLUSTER_KEY_MISMATCH and FAIL_FORBIDDEN because our testing showed the same error will almost always occur again.
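The split described so far can be sketched as a small lookup. This is only an illustration under stated assumptions: `ErrorKind` and `RetryClassifier` are hypothetical names, not part of the Aerospike client, and the enum constants merely stand in for the conditions and result codes named in this thread. The "maxRetries not exceeded" check is modeled here as "retries already made is less than maxRetries", which is an assumption about the exact counting.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-ins for the error conditions named in this thread.
// These are NOT the client's own types or constants.
enum ErrorKind {
    SOCKET_TIMEOUT, TOTAL_TIMEOUT, CONNECTION_ERROR, SOCKET_IO_ERROR, SERVER_TIMEOUT,
    INVALID_NODE_ERROR, SERVER_ERROR, CLUSTER_KEY_MISMATCH, FAIL_FORBIDDEN, DEVICE_OVERLOAD
}

final class RetryClassifier {
    // The four conditions listed above as automatically retried by the client.
    static final Set<ErrorKind> AUTO_RETRIED = EnumSet.of(
            ErrorKind.SOCKET_TIMEOUT,
            ErrorKind.CONNECTION_ERROR,
            ErrorKind.SOCKET_IO_ERROR,
            ErrorKind.SERVER_TIMEOUT);

    private RetryClassifier() {}

    // True if, per this thread, the client would retry on its own.
    // Assumption: a retry is allowed while retriesDone < maxRetries.
    static boolean clientRetries(ErrorKind kind, int retriesDone, int maxRetries) {
        return AUTO_RETRIED.contains(kind) && retriesDone < maxRetries;
    }
}
```

Under this sketch, `RetryClassifier.clientRetries(ErrorKind.TOTAL_TIMEOUT, 0, 2)` is false, matching the note that total timeouts are not retried, and the same holds for the four codes the client skips because the same error almost always recurs.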

DEVICE_OVERLOAD is one error that could be retried with some success, but we leave that up to the user to decide.
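An external retry of this kind could look roughly like the sketch below. It is a minimal illustration, not the client's behavior: `ExternalRetry` and `OverloadedException` are hypothetical names, and the linear backoff is an arbitrary choice. In real code the predicate would presumably inspect the caught client exception's result code (for example, comparing it against the DEVICE_OVERLOAD code) rather than use a stand-in exception type.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical stand-in for an error that carries the DEVICE_OVERLOAD result code.
final class OverloadedException extends RuntimeException {
    OverloadedException() {
        super("device overload (illustrative)");
    }
}

final class ExternalRetry {
    private ExternalRetry() {}

    // Runs op, retrying up to maxRetries times when retryable accepts the failure.
    // Sleeps sleepMillis * attemptNumber between attempts (simple linear backoff).
    static <T> T call(Supplier<T> op,
                      Predicate<RuntimeException> retryable,
                      int maxRetries,
                      long sleepMillis) {
        int retries = 0;
        while (true) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                // Give up on non-retryable errors or once the retry budget is spent.
                if (retries >= maxRetries || !retryable.test(e)) {
                    throw e;
                }
                retries++;
                try {
                    Thread.sleep(sleepMillis * retries);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt flag and surface the original failure.
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

A caller would wrap a single operation, for example `ExternalRetry.call(() -> doWrite(), e -> e instanceof OverloadedException, 3, 50L)`, where `doWrite` is a placeholder for the actual client call. Keeping the predicate narrow reflects the point made in this thread: most non-timeout errors tend to recur, so retrying them blindly rarely helps.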

Aloren commented 5 years ago

"The user can perform an external retry on any error that they want."

can, but probably for most cases should not :)

Thank you for the explanation, that is exactly what I wanted to know.