near / borsh-rs

Rust implementation of Binary Object Representation Serializer for Hashing
https://borsh.io/
Apache License 2.0

Support for adding new fields to previously serialized structs #299

Open yanliu38 opened 1 month ago

yanliu38 commented 1 month ago

Hi there,

We have existing data that was serialized from a struct like this:

use borsh::{BorshDeserialize, BorshSerialize};

#[derive(BorshSerialize, BorshDeserialize)]
struct MyStruct {
    field1: u32,
    field2: String,
}

We would now like to add a new field to the struct, so that it looks like this:

use borsh::{BorshDeserialize, BorshSerialize};

#[derive(BorshSerialize, BorshDeserialize)]
struct MyStruct {
    field1: u32,
    field2: String,
    field3: Option<u64>,
}

However, when deserializing existing data with the new MyStruct, the following error is returned:

Custom { kind: InvalidData, error: "Unexpected length of input" }
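For context on why this fails: Borsh writes fields back to back with no field names or delimiters, and an Option<T> is encoded as one tag byte (0 for None, 1 followed by the value for Some). The sketch below hand-rolls that layout with the standard library only, to make the byte counts visible. The function names encode_old and encode_new are hypothetical and are not part of the borsh-rs API.

```rust
// Hand-rolled sketch of the Borsh byte layout for the two struct versions.
// Assumption: std only, for illustration; real code would use the borsh derives.

/// Old layout: u32 (little-endian), then String as u32 length + UTF-8 bytes.
fn encode_old(field1: u32, field2: &str) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&field1.to_le_bytes());
    out.extend_from_slice(&(field2.len() as u32).to_le_bytes());
    out.extend_from_slice(field2.as_bytes());
    out
}

/// New layout: the old layout followed by Option<u64>
/// (tag byte 0 = None; tag byte 1 = Some, then 8 little-endian bytes).
fn encode_new(field1: u32, field2: &str, field3: Option<u64>) -> Vec<u8> {
    let mut out = encode_old(field1, field2);
    match field3 {
        None => out.push(0),
        Some(v) => {
            out.push(1);
            out.extend_from_slice(&v.to_le_bytes());
        }
    }
    out
}
```

Even a None value costs one byte in the new layout, so a payload written with the old struct is at least one byte short, and the decoder runs out of input while reading field3.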

What is the recommended way to add fields to a struct that already has serialized data?

yanliu38 commented 1 month ago

Seems to be related: https://github.com/near/borsh-rs/issues/114

frol commented 1 month ago

Borsh does not include the schema inside the serialized data, so you should plan your upgradability accordingly. If you start from scratch, you have two paths:

  1. Keep versioned data structures using an enum (enum MyStruct { V1(MyStructV1) }), which adds a one-byte tag as part of the Borsh serialization for enums; here is an example.
  2. Plan to always keep the data in its latest format: implement migration logic that reads all the old data using the old structs, converts it into the new structs, and writes it back; here is an example.
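As a rough illustration of path (1), the sketch below decodes a payload whose first byte is the enum variant index, which is how Borsh tags enums, and upgrades older versions to the latest struct on read. It is written against the standard library only so that the byte handling is explicit. MyStructV2, take, read_u32 and decode_versioned are hypothetical names, not borsh-rs API; with borsh-rs, deriving BorshDeserialize on the versioned enum would do the tag dispatch.

```rust
// Hypothetical std-only sketch of a versioned envelope; not the borsh-rs API.

#[derive(Debug, PartialEq)]
struct MyStructV2 {
    field1: u32,
    field2: String,
    field3: Option<u64>,
}

/// Takes `n` bytes from the front of `input`, or returns None if it is too short.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let slice: &'a [u8] = *input;
    if slice.len() < n {
        return None;
    }
    let (head, tail) = slice.split_at(n);
    *input = tail;
    Some(head)
}

/// Reads a little-endian u32.
fn read_u32(input: &mut &[u8]) -> Option<u32> {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(take(input, 4)?);
    Some(u32::from_le_bytes(buf))
}

/// Decodes `tag | field1 | field2 | [field3]`, upgrading V1 data to the V2 struct.
fn decode_versioned(mut input: &[u8]) -> Option<MyStructV2> {
    let tag = take(&mut input, 1)?[0];
    if tag > 1 {
        return None; // unknown version
    }
    let field1 = read_u32(&mut input)?;
    let len = read_u32(&mut input)? as usize;
    let field2 = String::from_utf8(take(&mut input, len)?.to_vec()).ok()?;
    let field3 = if tag == 0 {
        None // V1 payload: field3 did not exist yet
    } else {
        // V2 payload: Option<u64> as a tag byte plus an optional 8-byte value.
        match take(&mut input, 1)?[0] {
            0 => None,
            1 => {
                let mut buf = [0u8; 8];
                buf.copy_from_slice(take(&mut input, 8)?);
                Some(u64::from_le_bytes(buf))
            }
            _ => return None,
        }
    };
    if !input.is_empty() {
        return None; // trailing bytes mean the payload is malformed
    }
    Some(MyStructV2 { field1, field2, field3 })
}
```

The cost of this design is one extra byte per record; in return, every future field addition becomes a new variant and a conversion, with no need to guess which layout a payload uses.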

If you already have data serialized with the old MyStruct, you can follow (2) and consider implementing (1) for seamless operation going forward.
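A minimal sketch of such a one-off migration, under the assumption that every stored record is exactly one old-layout MyStruct: decode with the old shape, convert, and re-encode with the new shape. It uses only the standard library and hand-rolls the Borsh layout; MyStructOld, decode_old, encode_new and migrate are hypothetical names. With borsh-rs itself, the old struct would keep its BorshDeserialize derive and the new struct would use BorshSerialize.

```rust
// Hypothetical std-only migration sketch; not the borsh-rs API.

struct MyStructOld {
    field1: u32,
    field2: String,
}

struct MyStruct {
    field1: u32,
    field2: String,
    field3: Option<u64>,
}

impl From<MyStructOld> for MyStruct {
    fn from(old: MyStructOld) -> Self {
        // Old records never had field3, so it starts out as None.
        MyStruct { field1: old.field1, field2: old.field2, field3: None }
    }
}

/// Decodes the old layout: u32, then String as u32 length + UTF-8 bytes.
/// Returns None if the input is not exactly one well-formed old record.
fn decode_old(bytes: &[u8]) -> Option<MyStructOld> {
    if bytes.len() < 8 {
        return None;
    }
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&bytes[0..4]);
    let field1 = u32::from_le_bytes(buf);
    buf.copy_from_slice(&bytes[4..8]);
    let len = u32::from_le_bytes(buf) as usize;
    if bytes.len() - 8 != len {
        return None;
    }
    let field2 = String::from_utf8(bytes[8..].to_vec()).ok()?;
    Some(MyStructOld { field1, field2 })
}

/// Encodes the new layout: the old fields, then Option<u64> with a tag byte.
fn encode_new(v: &MyStruct) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&v.field1.to_le_bytes());
    out.extend_from_slice(&(v.field2.len() as u32).to_le_bytes());
    out.extend_from_slice(v.field2.as_bytes());
    match v.field3 {
        None => out.push(0),
        Some(x) => {
            out.push(1);
            out.extend_from_slice(&x.to_le_bytes());
        }
    }
    out
}

/// Rewrites one old record into the new layout.
fn migrate(old_bytes: &[u8]) -> Option<Vec<u8>> {
    let old = decode_old(old_bytes)?;
    Some(encode_new(&MyStruct::from(old)))
}
```

Running migrate over every stored record once, before deploying code that reads the new struct, keeps the data in a single layout. Records that fail to decode are reported as None rather than guessed at.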

Would you like to contribute a README/docs section on this topic? I would be happy to review a PR.