richardartoul / nola

Implement Retry Policy and High-Availability Tests #81

Closed by aratz-lasa 1 year ago

aratz-lasa commented 1 year ago

Description

This PR addresses issue #70. It adds a retry policy for actor invocations and a high-availability test that checks invocations still succeed after a server failure.

Code Changes

  1. Test SurviveReplicaFailure: This new test validates high availability and fault tolerance when a server fails. It spawns actors with multiple replicas, kills one of the servers to simulate a failure, and checks that actor invocations still succeed despite the lost replica.

  2. environment.invokeReferences(): The documentation for this method now describes its purpose and behavior. The method selects a server to invoke based on the provided references and the create flag. It returns the index of the reference that was invoked, the response as an io.ReadCloser, and an error if the invocation failed.

  3. types.ActorOptions.RetryPolicy: The new RetryPolicy field in the ActorOptions struct specifies what happens when an actor invocation fails. The possible values are "retry_never" (no retries) and "retry_if_replica_available" (retry on other available replicas).

Additional Notes

#70