near / borsh

Binary Object Representation Serializer for Hashing
https://borsh.io/

Add `#[borsh_optional]` marker for backwards compatibility #99

Open MaksymZavershynskyi opened 4 years ago

MaksymZavershynskyi commented 4 years ago

Motivation

Suppose we have a Rust structure:

#[derive(BorshSerialize, BorshDeserialize)]
struct A {
  f1: T1,
  f2: T2
}

Suppose we have serialized it into some data (e.g. on disk in RocksDB, in contract state, or circulating in the network). Then we want to upgrade this structure by adding another field:

#[derive(BorshSerialize, BorshDeserialize)]
struct A {
  f1: T1,
  f2: T2,
  f3: T3
}

It would be extremely convenient for upgradability if we could deserialize old data using the new Rust type.

Proposal

We can introduce a #[borsh_optional] attribute that can be used like this:

#[derive(BorshSerialize, BorshDeserialize)]
struct A {
  f1: T1,
  f2: T2,
  #[borsh_optional]
  f3: Option<T3>
}

Then, when we deserialize old data with this structure, f3 will be None, but when we deserialize new data with it, f3 will be Some.
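As a rough illustration of that behavior, here is a hand-written sketch that uses only the standard library rather than the borsh derive macros. It assumes u32, u8 and u64 as stand-ins for T1, T2 and T3, and it assumes the rule "an exhausted buffer means the field is absent", which the proposal does not spell out. The names read_array and deserialize_a are hypothetical. New data is read as a regular Borsh Option (a one-byte tag followed by the little-endian payload).

```rust
use std::io::{self, Error, ErrorKind};

#[derive(Debug, PartialEq)]
struct A {
    f1: u32,         // stand-in for T1
    f2: u8,          // stand-in for T2
    f3: Option<u64>, // the field that would carry #[borsh_optional]
}

/// Hypothetical helper: split N bytes off the front of `buf`.
fn read_array<const N: usize>(buf: &mut &[u8]) -> io::Result<[u8; N]> {
    if buf.len() < N {
        return Err(Error::new(ErrorKind::InvalidData, "unexpected end of input"));
    }
    let (head, tail) = buf.split_at(N);
    *buf = tail;
    let mut out = [0u8; N];
    out.copy_from_slice(head);
    Ok(out)
}

/// Hypothetical hand-written equivalent of what the derive could generate.
fn deserialize_a(buf: &mut &[u8]) -> io::Result<A> {
    let f1 = u32::from_le_bytes(read_array::<4>(buf)?);
    let f2 = read_array::<1>(buf)?[0];
    // Assumed rule: old data ends here, so an empty buffer means "field absent".
    let f3 = if buf.is_empty() {
        None
    } else {
        // New data: a regular Borsh Option (tag byte, then the payload).
        match read_array::<1>(buf)?[0] {
            0 => None,
            1 => Some(u64::from_le_bytes(read_array::<8>(buf)?)),
            _ => return Err(Error::new(ErrorKind::InvalidData, "invalid Option tag")),
        }
    };
    Ok(A { f1, f2, f3 })
}
```

Under these assumptions, five bytes written by the old two-field layout decode with f3 set to None, while bytes written by the new layout decode with whatever Option was stored.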

It will only work if the optional fields come last in the struct:

#[derive(BorshSerialize, BorshDeserialize)]
struct A {
  f1: T1,
  f2: T2,
  #[borsh_optional]
  f3: Option<T3>,
  #[borsh_optional]
  f4: Option<T4>,
  #[borsh_optional]
  f5: Option<T5>
}

Compilation should fail in situations like the following, where an unmarked field (f4) comes after a marked one:

#[derive(BorshSerialize, BorshDeserialize)]
struct A {
  f1: T1,
  f2: T2,
  #[borsh_optional]
  f3: Option<T3>,
  f4: Option<T4>,
  #[borsh_optional]
  f5: Option<T5>
}
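The check itself is simple: the marked fields have to form a suffix of the field list. Below is a minimal sketch of that rule over a plain slice of flags, one per field in declaration order. A real derive macro would walk the parsed struct fields instead; the function name check_optional_suffix is hypothetical.

```rust
/// Hypothetical check: `marked[i]` is true when field i carries #[borsh_optional].
/// Returns Err(index) for the first unmarked field that follows a marked one.
fn check_optional_suffix(marked: &[bool]) -> Result<(), usize> {
    let mut seen_optional = false;
    for (i, &is_optional) in marked.iter().enumerate() {
        if is_optional {
            seen_optional = true;
        } else if seen_optional {
            // An unmarked field after a marked one: the derive should reject this.
            return Err(i);
        }
    }
    Ok(())
}
```

For the struct above, the flags are false, false, true, false, true, and the check reports the field at index 3 (f4).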

CC @mfornet Since it might be relevant to https://github.com/nearprotocol/NEPs/pull/95

bowenwang1996 commented 4 years ago

This doesn't work very well when, for example, the type of f1 needs to change. It also makes it a bit more difficult to distinguish between versions (you have to check whether something is None first).

mfornet commented 4 years ago

I'm not sure if this change can be done in such a way that we can recursively deserialize data structures. For example:


#[derive(BorshSerialize, BorshDeserialize)]
struct A {
    a1: T1,
    #[borsh_optional]
    a2: Option<T2>,
}

#[derive(BorshSerialize, BorshDeserialize)]
struct B {
    b1: T1,
}

#[derive(BorshSerialize, BorshDeserialize)]
struct C {
   a: A,
   b: B,
}

The struct C can't be deserialized using this strategy, so we should enforce a stricter constraint: "It will only work if all optional fields are at the back, and no inner type uses optional fields". (It can be relaxed a little bit, but not in a way that makes it easy to work with.)
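To make the problem concrete, here is a std-only sketch that assumes u8 for both T1 and T2 and assumes the marked field is decoded with the rule "empty buffer means absent, otherwise read a regular Borsh Option". All function names are hypothetical. Inside C, the buffer is never empty after a1 because B's bytes follow, so the decoder misreads b1 as the Option tag of a2.

```rust
use std::io::{self, Error, ErrorKind};

#[derive(Debug, PartialEq)]
struct A { a1: u8, a2: Option<u8> } // a2 would carry #[borsh_optional]
#[derive(Debug, PartialEq)]
struct B { b1: u8 }
#[derive(Debug, PartialEq)]
struct C { a: A, b: B }

/// Hypothetical helper: take one byte off the front of `buf`.
fn read_u8(buf: &mut &[u8]) -> io::Result<u8> {
    let (&first, rest) = buf
        .split_first()
        .ok_or_else(|| Error::new(ErrorKind::InvalidData, "unexpected end of input"))?;
    *buf = rest;
    Ok(first)
}

fn deserialize_a(buf: &mut &[u8]) -> io::Result<A> {
    let a1 = read_u8(buf)?;
    // Assumed rule: only an exhausted buffer signals that a2 is absent.
    let a2 = if buf.is_empty() {
        None
    } else {
        match read_u8(buf)? {
            0 => None,
            1 => Some(read_u8(buf)?),
            _ => return Err(Error::new(ErrorKind::InvalidData, "invalid Option tag")),
        }
    };
    Ok(A { a1, a2 })
}

fn deserialize_c(buf: &mut &[u8]) -> io::Result<C> {
    let a = deserialize_a(buf)?;
    // By now deserialize_a may already have consumed the byte that belongs to b1.
    let b = B { b1: read_u8(buf)? };
    Ok(C { a, b })
}
```

An old A on its own (a single byte) still decodes, but an old C (one byte for a1, one for b1) fails for every value of b1: the byte is consumed as a tag, and then either the payload or b1 itself is missing.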

Unfortunately, this feature has limited application to https://github.com/nearprotocol/NEPs/pull/95. In practice, most of the changes happen in fields deep inside some data structure, or are changes to some variants of enums.

I should update my proposal, but it is more along the lines of introducing new traits BorshSerializeVersioned and BorshDeserializeVersioned that expose the following API:

trait BorshSerializeVersioned {
    fn serialize_with<W: Write>(&self, writer: &mut W, version: u64) -> io::Result<()>;
}

trait BorshDeserializeVersioned: Sized {
    fn deserialize_with(buf: &mut &[u8], version: u64) -> io::Result<Self>;
}
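For illustration only, here is one way these traits might be implemented by hand. This sketch takes the second trait to be BorshDeserializeVersioned, as the text above names it, and adds a Sized bound so that io::Result<Self> compiles. The version numbering (1 = layout without f3, 2 = layout with f3), the default value 0 for a missing f3, and the helper check_version are all assumptions made for the example, not part of the proposal.

```rust
use std::io::{self, Error, ErrorKind, Read, Write};

trait BorshSerializeVersioned {
    fn serialize_with<W: Write>(&self, writer: &mut W, version: u64) -> io::Result<()>;
}

trait BorshDeserializeVersioned: Sized {
    fn deserialize_with(buf: &mut &[u8], version: u64) -> io::Result<Self>;
}

#[derive(Debug, PartialEq)]
struct A {
    f1: u32,
    f3: u64, // assumed to exist only from version 2 onwards
}

/// Hypothetical helper: this sketch only knows versions 1 and 2.
fn check_version(version: u64) -> io::Result<()> {
    match version {
        1 | 2 => Ok(()),
        _ => Err(Error::new(
            ErrorKind::InvalidInput,
            format!("unknown version {}", version),
        )),
    }
}

impl BorshSerializeVersioned for A {
    fn serialize_with<W: Write>(&self, writer: &mut W, version: u64) -> io::Result<()> {
        check_version(version)?;
        writer.write_all(&self.f1.to_le_bytes())?;
        if version >= 2 {
            writer.write_all(&self.f3.to_le_bytes())?;
        }
        Ok(())
    }
}

impl BorshDeserializeVersioned for A {
    fn deserialize_with(buf: &mut &[u8], version: u64) -> io::Result<Self> {
        check_version(version)?;
        let mut f1 = [0u8; 4];
        buf.read_exact(&mut f1)?;
        let f3 = if version >= 2 {
            let mut bytes = [0u8; 8];
            buf.read_exact(&mut bytes)?;
            u64::from_le_bytes(bytes)
        } else {
            0 // assumed default for data written before f3 existed
        };
        Ok(A { f1: u32::from_le_bytes(f1), f3 })
    }
}
```

Because the version is passed down explicitly, a nested type can pick its layout without inspecting how many bytes remain, which is what the trailing-optional approach cannot do.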