near / borsh-js

TypeScript/JavaScript implementation of Binary Object Representation Serializer for Hashing
Apache License 2.0

Deserialize requires type assertion #71

Closed: jnsvd closed this issue 8 months ago

jnsvd commented 11 months ago

The deserialize method is not implemented as documented, with an optional Class parameter:

deserialize(schema: Schema, buffer: Uint8Array, class?: Class): any

Instead, it is:

deserialize(schema: Schema, buffer: Uint8Array, validate: boolean): DecodeTypes

Since there is no link between the Schema and the type it describes, a type assertion is required to assign deserialized values to custom types:

import * as borsh from "borsh";
// assuming Schema and DecodeTypes are importable from "borsh"
import type { Schema, DecodeTypes } from "borsh";

let data = ...; // a Uint8Array of serialized bytes

type MyType = {
    value: string;
};
const MyTypeSchema: Schema = {
  struct: {
    value: "string"
  }
};

// No explicit type
const myTypeInstance_1 = borsh.deserialize(MyTypeSchema, data);
myTypeInstance_1.value; // Unresolved variable value 

// Use DecodeTypes
const myTypeInstance_2: DecodeTypes = borsh.deserialize(MyTypeSchema, data);
myTypeInstance_2.value; // Unresolved variable value

// Use MyType without type assertion
const myTypeInstance_3: MyType = borsh.deserialize(MyTypeSchema, data); // Initializer type DecodeTypes is not assignable to variable type MyType 

// Use type assertion
const myTypeInstance_4: MyType = <MyType>borsh.deserialize(MyTypeSchema, data);
myTypeInstance_4.value; // OK

Is this the way the library is supposed to be used? In my opinion the previous interface was cleaner. In any case, the README file is not in line with the recent changes.

gagdiez commented 10 months ago

Hi @jnsvd, thanks for flagging the issue. Indeed, there was an error in the README; it is fixed now.

I see your point about the deserialize method, and I think it is valid. We could actually change the deserialize interface to deserialize<T>(...): T, and that would not change the current behaviour for our JS users.

It would change for TS users, since deserialize would now return unknown if no T is given. This is a breaking change, but how many people are really using DecodeTypes as a result type? I assume they are always casting it to their T anyway.
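
For illustration, here is a minimal sketch of the proposed shape, written as a user-land wrapper (deserializeAs is hypothetical and not part of borsh-js; it assumes the Schema type is importable from "borsh"):

import * as borsh from "borsh";
import type { Schema } from "borsh";

// Hypothetical wrapper showing the proposed deserialize<T> signature;
// it just forwards to the current API and casts the result.
function deserializeAs<T = unknown>(schema: Schema, buffer: Uint8Array): T {
  // double cast because DecodeTypes and an arbitrary T need not overlap
  return borsh.deserialize(schema, buffer) as unknown as T;
}

type MyType = { value: string };
const MyTypeSchema: Schema = { struct: { value: "string" } };

const data = borsh.serialize(MyTypeSchema, { value: "hello" });
const typed = deserializeAs<MyType>(MyTypeSchema, data);
typed.value; // typed as string, no cast at the call site

const untyped = deserializeAs(MyTypeSchema, data); // inferred as unknown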

I would not want to go back to the old deserialize(..., class?: Class), since that would imply having to handle a class? parameter within the method, which would just complicate the code without adding much to the actual use case.

what do you think?

cc @ailisp

gagdiez commented 10 months ago

@jnsvd @ailisp actually, while testing the new implementation, I remembered why we did not do this!

Casting does not solve the actual problem of transforming the deserialized object into the expected one. For example, if you have a structure with an array of Uint8Arrays:

u8arrays: Uint8Array[]

borsh will deserialize it into

u8arrays: number[][]

and casting will not fix that (i.e. the cast will not transform number[][] into Uint8Array[]).
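
To make this concrete, here is a minimal sketch (assuming the struct/array schema syntax from the borsh-js README, and the number[][] behaviour described above) of why the user has to re-build the object rather than just cast it:

import * as borsh from "borsh";
import type { Schema } from "borsh";

type Wrapper = { u8arrays: Uint8Array[] };

// Nested array schema: an array of arrays of u8
const WrapperSchema: Schema = {
  struct: { u8arrays: { array: { type: { array: { type: "u8" } } } } },
};

const buffer = borsh.serialize(WrapperSchema, { u8arrays: [[1, 2], [3]] });

// A cast only changes the static type; at runtime the field
// still holds plain number[][] values:
const castOnly = borsh.deserialize(WrapperSchema, buffer) as Wrapper;
console.log(castOnly.u8arrays[0] instanceof Uint8Array); // false

// Re-building the object is what actually produces Uint8Arrays:
const raw = borsh.deserialize(WrapperSchema, buffer) as { u8arrays: number[][] };
const rebuilt: Wrapper = { u8arrays: raw.u8arrays.map((a) => new Uint8Array(a)) };
console.log(rebuilt.u8arrays[0] instanceof Uint8Array); // true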

Before, the code took the class directly to infer types, i.e. decode(..., A), which I think solves the above but brings other bugs that are harder to fix.

In particular, there were scenarios in which decode(..., v1.A) was creating objects of type A that were not compatible with those from decode(..., v2.A), even when the interfaces of v1.A and v2.A were the same; see this issue.
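
As a hypothetical illustration (A_v1 and A_v2 stand in for v1.A and v2.A), two structurally identical classes still fail nominal runtime checks:

// Two modules each declare a structurally identical class:
class A_v1 { constructor(public value: string = "") {} }
class A_v2 { constructor(public value: string = "") {} }

// decode(..., A_v1) produced instances of A_v1, so code performing
// nominal checks against the other module's class failed, even
// though the two interfaces are identical:
const obj = new A_v1("x");
console.log(obj instanceof A_v2); // false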

In this way, the new implementation:

  1. Allows decoding equivalent interfaces with different class names in a consistent way.
  2. Keeps the inner code simpler to maintain.

The trade-offs are:

  3. It costs the user an extra cast in simple cases.
  4. In complex cases, the user needs to re-build the object from the deserialized result.

We believe that the benefits of (1) and (2) outweigh the problems of (3) and (4), since the method's signature is explicit about it anyway (i.e. it tells you that it returns a DecodeTypes, and not a specific class).