MatrixAI / Polykey

Polykey Core Library
https://polykey.com
GNU General Public License v3.0
29 stars 4 forks source link

Client Connection Loss to Agent Exception #276

Closed CMCDragonkai closed 1 week ago

CMCDragonkai commented 2 years ago

Specification

When the PK CLI or GUI loses connection to the agent, there should be a domain level exception in src/client/errors.ts and also a good human readable message.

2 existing exceptions are at play here:

class ErrorGRPCClientTimeout extends ErrorGRPC {}

class ErrorGRPCConnection extends ErrorGRPC {}

These should be caught in the src/client somewhere, and then handled by the src/bin commands.

class ErrorClientClientConnection extends ErrorClient {}

That way we can always know that it's a connection error.

Existing bin scripts already catch the above ErrorGRPCClient... but they should be catching ErrorClientClientConnection. I think both grpc timeouts and grpc connection errors lead to the same result from the client domain perspective.

Additional context

Tasks

  1. ...
  2. ...
  3. ...
CMCDragonkai commented 2 years ago

@joshuakarp should this part of our feature freeze?

joshuakarp commented 2 years ago

Originally I was seeing this as quite low priority, and so not necessary for the feature freeze. But this might actually be quite useful for our own usage of Polykey when deployed on AWS. Thoughts?

CMCDragonkai commented 2 years ago

A quick way to solve this is to move all of our comment descriptions of the exceptions into the actual description property and also use the appropriate exit code from sysexits. We can bring in the sysexits package from npm to do this in a more human readable way.

We just need to add a test into our bin code to test this behaviour, the command it should centralise on is agent status.

tegefaulkes commented 2 years ago

a shutdownCallback was added to GRPCClient that will be called when a connection failure is detected. otherwise an ErrorGRPCClientTimeout error will be emitted if there is a failure to connect or the connection drops and a call is attempted.

https://github.com/MatrixAI/js-polykey/pull/310#issuecomment-1016118180

tegefaulkes commented 2 years ago

Currently NodeConnectionManager wraps the NodeConnection<GRPCClientAgent> and provides withConnection for making calls the the connection/client. if a connection error is emitted by the client then withConnection can re-throw it as a ErrorClientClientConnection error. we may need to provide a similar with pattern for the GRPCClientClient. we should also have the GRPCClientClient make use of the new shutdownCallback to kill itself if the connection fails.

This may look like a with method in GRPCClientClient that exposes all of the GRPC methods in an interface;


@ready(new ErrorClientClientConnection)
public async with<T>(f:(clien: ClientInterface) => Promise<T>) {
    try {
        return f(this as ClientInterface);
    } catch (err) {
        if(err typeof someError) {
            // Was a connection failure
            await this.destroy()
            throw new ErrorClientClientConnection;
        }
        // otherwise 
        throw err
    } 
}
CMCDragonkai commented 2 years ago

I suggest a single GRPC connection error exception that is out into grpc/errors.ts. This is better than creating separate exceptions in client and agent domains. It can be rethrown as NodeConnectionError to be more specific later if need be.

CMCDragonkai commented 2 years ago

Oh wait I re-read the issue and therefore to might make sense to catch and specialise it to client domain error or agent domain error. Because certain ops will do a client call that does a agent call. So being able to specialise ErrorGRPCConnection to ErrorClientConnection and ErrorAgentConnectiom can be useful. This is just pseudo names btw.

CMCDragonkai commented 2 years ago

Can you flesh out the task list in this issue based on what we want to end up doing.

CMCDragonkai commented 2 years ago

310 enables the ability to resolve this issue, but this would have to be addressed in a later PR following https://github.com/MatrixAI/js-polykey/issues/276#issuecomment-1017027663

CMCDragonkai commented 2 years ago

The GRPCClient is now exposing the destroyCallback which is being used by NodeConnection. Similarly the GRPCClientClient will be destroyed if the underlying GRPC channel is shutdown.

This means any usage of GRPCClientClient should hit the destruction exception ErrorClientClientDestroyed if it tries to do another call on it.

However we may want to propagate it up all the way to the PolykeyClient class itself, and perform a destruction of that instance too. Which should then propagate to the CLI when it tries to perform a method call on the pkClient.

CMCDragonkai commented 1 week ago

The original spec relies alot on the way GRPC worked. GRPC isn't relevant anymore as we changed to using js-rpc on https://github.com/MatrixAI/Polykey/tree/4ddd9dd726995e4431a09b26b764f5d53fd5e4ac (October the 6th 2023).

However having good errors is still a good idea, so we should verify this behaviour later. @tegefaulkes

4ddd9dd726995e4431a09b26b764f5d53fd5e4ac