This fixes Cascade Issue #54. In the current Derecho implementation, an external client exits abruptly without notifying the Derecho group members it is connected to. As a result, a command-line client (such as the one in Cascade) freezes when it tries to reconnect shortly afterwards (e.g. on the next command invocation).
The solution is a graceful exit procedure for a departing external client. Besides solving the issue above, it also eliminates the error messages that used to appear when a client left. This PR does the following:
- Adds a new JoinRequest type (REMOVE_P2P).
- The ExternalGroupClient destructor sends a REMOVE_P2P join request to every node with which it has an active P2P connection.
- The ExternalGroupClient destructor also cleans up its resources, so that a new process started afterwards can reuse them without problems.
- ViewManager has a new upcall (remove_external_connection) that forwards to RPCManager by calling the new method rpc_manager.remove_external_connection.
- When a REMOVE_P2P join request is received, RPCManager and ViewManager remove the external node from the P2P connections and the SST. This is the same cleanup that runs when a failure is detected, which was previously the only way external clients were cleaned up.
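The flow described in the list above can be illustrated with a small, self-contained sketch. This is not the Derecho code: `ToyMember`, `ToyExternalClient`, the `JOIN` enumerator, and `connection_survives_client_exit` are hypothetical names made up for illustration, and the real cleanup also touches the SST, which is omitted here. Only `REMOVE_P2P` and `remove_external_connection` are names taken from this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical stand-ins; these are NOT Derecho's real declarations.
using node_id_t = std::uint32_t;

enum class JoinRequestType : std::uint8_t {
    JOIN,       // assumed pre-existing request kind, for illustration only
    REMOVE_P2P  // the new request type added by this PR
};

struct JoinRequest {
    node_id_t sender_id;
    JoinRequestType type;
};

// Toy group member: tracks which external clients hold P2P connections.
class ToyMember {
public:
    void handle_join_request(const JoinRequest& req) {
        if(req.type == JoinRequestType::REMOVE_P2P) {
            remove_external_connection(req.sender_id);
        } else {
            external_connections.insert(req.sender_id);
        }
    }

    bool has_connection(node_id_t id) const {
        return external_connections.count(id) > 0;
    }

private:
    // Plays the role of the new upcall: drop the client's connection state.
    void remove_external_connection(node_id_t id) {
        external_connections.erase(id);
    }

    std::set<node_id_t> external_connections;
};

// Toy external client: notifies every connected member when it is destroyed.
class ToyExternalClient {
public:
    explicit ToyExternalClient(node_id_t id) : my_id(id) {}

    void connect(node_id_t member_id, ToyMember& member) {
        member.handle_join_request({my_id, JoinRequestType::JOIN});
        active_p2p[member_id] = &member;
    }

    ~ToyExternalClient() {
        // Graceful exit: tell each member to remove this client.
        for(auto& entry : active_p2p) {
            entry.second->handle_join_request({my_id, JoinRequestType::REMOVE_P2P});
        }
        active_p2p.clear();
    }

private:
    node_id_t my_id;
    std::map<node_id_t, ToyMember*> active_p2p;
};

// Returns true if the member still tracks the client after the client exits.
bool connection_survives_client_exit() {
    ToyMember member;
    {
        ToyExternalClient client(42);
        client.connect(0, member);
        assert(member.has_connection(42));
    }  // client destructor runs here and sends REMOVE_P2P
    return member.has_connection(42);
}
```

In this sketch the member's state is clean as soon as the client goes out of scope, so a client that reconnects immediately does not have to wait for failure detection to remove the stale entry.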
I tested this solution with TCP and RDMA, using the Cascade Python and C clients.