Closed CMCDragonkai closed 2 years ago
In doing this, I discovered that our toJSON
method is actually buggy in PK. It is now fixed here and will be used.
Maybe this impacts GRPC error serialisation.
Furthermore I'm playing around with error chaining types to see if it's possible to have type-guided error chains which will help for debugging.
This will require js-errors
onto the gitlab CI/CD.
This has made the ErrorChain
into a generic type instead. This enables some nice way of allowing TS to "guide" the usage of the cause
. Thus a deep exception chain can result in a deep type. Like AbstractError<AbstractError<Error>>
.
One of the important functions would be a static fromJSON
method.
This would follow our js-id
style where we take any
and convert it back to JSON
as per the format of toJSON
.
Note that when using toJSON
we actually produced something like:
function toJSON(id: IdInternal): { type: string; data: Array<number> } {
return {
type: IdInternal.name,
data: [...id],
};
}
Thus producing a very "structured" JSON version of the exception. Unlike our current toJSON
which just produces a POJO of all the properties.
Since we may expect to serialise exceptions over the network, we would want something to do this.
However there's a little challenge here. Because the cause
itself may contain almost anything, there's no guarantee that when calling fromJSON
, that we could "reconstruct" the original object graph.
This is discussed in this comment https://github.com/MatrixAI/js-polykey/issues/304#issuecomment-1028561787
It was explained that GRPC metadata doesn't have "nested" data.
However all the fields are "multivalues". Thus it is possible to do:
const errorName = e.metadata.get('name')[0] as string;
const errorMessage = e.metadata.get('message')[0] as string;
const errorData = e.metadata.get('data')[0] as string;
Deserialising from this would not be using the fromJSON
as this would a custom format just for GRPC metadata.
However an alternative is to avoid relying on the metadata format, and directly set the stringified JSON in just one property. Which gives us some flexibility at the cost of some lower transparency in the GRPC metadata (any reaction to errors, has to involve potentially parsing the JSON first).
The grpcUtils.toError
is already used by all the generator wrapper utilities in grpc/utils
. This throws any GRPC ServiceError
as a domain specific error in Polykey.
During serialisation, we can produce a structured "{ type, data }" format that is used by js-id, so that we have a similar kind of "JSON encoded object" format where the type
is always the class name, and the data
is the the data required for the fromJSON
call.
Our GRPC metadata properties can mirror some of the important data from these exception objects. But there should be an error
property that actually contains the JSON exception information. e.metadata.get('error')
.
The current GRPC properties we are using is:
metadata.set('name', error.name);
metadata.set('message', error.message);
metadata.set('data', JSON.stringify((error as errors.ErrorPolykey).data));
We know that GRPC also sets additional properties whenever a ServerStatusResponse
object is emitted on an error
event.
export interface StatusObject {
code: Status;
details: string;
metadata: Metadata;
}
export declare enum Status {
OK = 0,
CANCELLED = 1,
UNKNOWN = 2,
INVALID_ARGUMENT = 3,
DEADLINE_EXCEEDED = 4,
NOT_FOUND = 5,
ALREADY_EXISTS = 6,
PERMISSION_DENIED = 7,
RESOURCE_EXHAUSTED = 8,
FAILED_PRECONDITION = 9,
ABORTED = 10,
OUT_OF_RANGE = 11,
UNIMPLEMENTED = 12,
INTERNAL = 13,
UNAVAILABLE = 14,
DATA_LOSS = 15,
UNAUTHENTICATED = 16
}
When we are throwing our application exceptions, we are using the UNKNOWN
code, this is just because that's the default code to be used when additional details aren't used in the toError(): ServerStatusResponse
. And according to https://grpc.github.io/grpc/core/md_doc_statuscodes.html it should be used for application errors. But there may be some overlap between application errors and these traditional errors. We may want to make use of some of these status codes like NOT_FOUND
when that's what it means.
Suppose all 3 metadata properties was reduced to just:
class ErrorSomething extends AbstractError {}
const e = new ErrorSomething();
metadata.set('error', JSON.stringify(e));
// we know the data will look like this
// {
// type: 'ErrorSomething',
// data: {
// ...,
// cause: {
// type: 'AnotherError',
// data: { ... }
// },
// ...
// }
// }
On the other side, we acquire the error
property like this:
// e.code === grpc.status['UNKNOWN']
// e.details === ??? - not sure what this would be, maybe an associated message for `UNKNOWN` status
errorsUtils.fromJSON(e.metadata.get('error')[0] as string)
Now the errorsUtils.fromJSON
or AbstractError.fromJSON
has do some complicated work.
It has to take the type
field, and then try to "reconstruct" the object.
In js-id
this is doable, cause we are always using IdInternal
, the same class.
In this case, AbstractError
is supposed to be used like an abstract class, and must construct based on a name. However without reference to the class structure, this is not possible. Furthermore, we cannot allow just arbitrary strings to be evaluated. It must be constructed with respect to a "restricted list" of classes.
This is what we currently do in toError
where it:
return new errors[errorName](errorMessage, JSON.parse(errorData));
This only works because errors
is a module being exposed like an object, and we use it as a lookup table for errorName
, to actually get a reference for the class to reconstruct.
Let me prototype a static function that gets inherited and uses the internal class name. We may be able to work something out, if we can expect the same class wrapper at the root of the exception structure.
Any types that are not valid exceptions, would then need to be converted to a "base" exception that gets used. We already do something like this in grpcUtils.toError
:
return new grpcErrors.ErrorGRPCClientCall(e.message, {
code: e.code,
details: e.details,
metadata: e.metadata.getMap(),
});
That acts as the base during recursive traversal of the JSON structure.
Because causes can be arbitrary, this creates a form of open-extensibility for the cause chain, which means that encoding and decoding to and from JSON has to deal with arbitrary data types.
To deal with this, I suspect a combination of JSON's replacer and reviver mechanisms have to be used to work against a select set of types.
In this library alone, the only types we are aware of are AbstractError
and Error
and descendants of Error
like SyntaxError
... etc.
During encoding, instances of Error
does not have toJSON
methods. This means JSON.stringify(new Error())
just gives back {}
.
We can at least resolve this issue by dealing with this in our toJSON
method.
It also seems that we may provide reviver
or replacer
utilities. But again we don't know the right types too.
Alternatively the cause
chain can be restricted rather than allowing any arbitrary type. What would this restriction be?
Seems like it could be limited to just instances of Error
and instances of AbstractError
. During encoding, we can know that all Error
instances can have name
, message
and stack
, and associated properties with AbstractError
, but we still won't be able to restore more specific types than AbstractError
.
So at the end of the day, the user must supply the appropriate reviver or replacer. This would be case with PK.
Additional requirement coming from https://github.com/MatrixAI/js-polykey/pull/321#issuecomment-1028554814
I realised that when converting GRPC errors back into exceptions, there's no indication that this is coming from another node/call, rather than being raised internally. The only data we have is error message and error data.
Consider when we make a function call to a particular domain like VaultManager
. The VaultManager
may end up calling NodeConnectionManager
and NodeConnection
and subsequently RPC the other node. When the other node receives it, its handler may end up calling VaultManager
as well. At any point here there may be exceptions that are thrown. It makes sense, that we should differentiate exceptions that were thrown from our own node code, or from the other side over a GRPC call.
One way to differentiate this is to augment the exception with additional details in the data
property either when it is sent over the GRPC metadata, or when it is received. Additionally a different class can be used to wrap exceptions that are being sent over the GRPC metadata. If details are always expected to be added on the sending side, this can be problematic because if the sender does something unexpected, this can affect the receiver. Instead it should be the receiver's responsibility upon receiving the exception to decorate/augment the exception appropriately that it came from the other side. Note that receiver here just means whichever side receives the exception, not necessarily who started a GRPC call.
This means grpcUtils.toError
would need to appropriately wrap the exceptions from ServiceError
object to errors.ErrorPolykeyRemote
(this was suggested as the special class).
This new exception can then have in its cause
chain pointing to the reconstructed exceptional objects that came in from serviceError.metadata.get('error')[0]
.
It may not actually be necessary to reconstruct all exception objects in the cause
chain. In fact there's no guarantee this would even be possible because the sender may have classes that does not exist on the receiver.
This can be addressed by have a base exception class indicating unknown
exception that it decodes to, or it can be dealt with by not decoding those JSON exceptions and terminating the recursive decoding there.
Additionally we should also differentiate client to agent calls, and agent to agent calls, so a single ErrorPolykeyRemote
would only work if the nodeId
was a property that we would know where the error came from. The PK node that we just called, or another PK node that our PK node called.
import { AbstractError } from '@matrixai/errors';
import sysexits from './utils/sysexits';
// exitCode is not part of `AbstractError` since it is specific to how PK uses it
class ErrorPolykey extends AbstractError {
static description = 'Polykey error';
exitCode: number = sysexits.GENERAL;
}
// here is our new wrappers intended to be used by `grpcUtils.toError` when receiving error over GRPC call
// Client to Agent calls and Agent to Agent Calls
// use the nodeId property to differentiate between own agent and foreign agent
class ErrorPolykeyRemote extends ErrorPolykey {
static description = 'Remote error from RPC call';
exitCode = sysexits.UNAVAILABLE;
nodeId: NodeId;
nodeAddress: NodeAddress;
}
// the exit code for `ErrorPolykeyRemote` may actually depend on what it is wrapping though
The grpcUtils.fromError
must strip debug information when sending exceptions over the network to untrusted source.
This should be done for client to agent and agent to agent.
We now have output format options in AbstractError.toJSON
that allows one to control what gets outputted during serialisation to JSON. Note that by not providing the cause
, deeper properties won't be provided. But if the cause
is also provided, then subsequent cause may also be provided.
That would mean one cannot just use JSON.stringify
to auto-call the exception object, but instead manually call toJSON()
first then use JSON.stringify
afterwards.
Alternatively the replacer
callback can be used here to help specialise what gets serialised.
Important details like stack
and some limitation of the cause
chain may be needed to be stripped.
As an example rather than JSON.stringify(new AbstractError)
.
You do:
const e = new AbstractError();
JSON.stringify(e.toJSON(undefined, { stack: false });
That ensures that the exception being stringified is already being filtered, in this case the stack
property is not provided.
It turns out that as soon as there's a toJSON
in the object, it is called before the replacer is called. But this is also true recursively. Thus our custom replacer here has no need to check if it is an instanceof AbstractError
as this condition will never be met. The toJSON
would already be called.
I just discovered that JS Error
now supports cause
property: https://v8.dev/features/error-cause
It's also available https://stackoverflow.com/questions/22303633/set-error-cause-in-javascript-node-js Node.js (16.9+). We aren't on that version just yet.
This is great, the design basically matches the exact API I have. And this should also mean that we can now expect a causation chain even for normal Error
instances.
For now I can't use it mainly due to:
{ cause: ... }
options parameter for CustomError
.Going to put this here for later, since I want to try out error integration first.
I'm not sure if js-errors
should be providing a generic way of encoding/decoding the base errors. This seems like an application concern. And therefore even the toJSON
may not entirely be relevant here.
Until we have figured out a good pattern for this. I suspect the toJSON
can only be used at one level. Any further causes will be further specialised downwards. It's complicated by the fact that you may not even want to provide properties of the errors in different serialisation situations. So we shall do it in PK.
The stringify
is top-down recursively (zooming-in), any object you return is further iterated over. The parse
is bottom-up recursively, any object you return is further iterated in the higher context (zooming-out).
Also test details...
Description
Issues Fixed
Tasks
AbstractError
as the global class to be inherited fromcause
property that allows one to determine an "exception chain"timestamp: Date
property that is weakly monotonic.toJSON
method to not actually useJSON.stringify
, it should be returning the POJO that will be used byJSON
Final checklist