informalsystems / tendermint-rs

Client libraries for Tendermint/CometBFT in Rust!
Apache License 2.0
591 stars 216 forks source link

Deserialization of `tendermint::block::Header` type fails unexpectedly #1309

Open preston-evans98 opened 1 year ago

preston-evans98 commented 1 year ago

What went wrong?

An attempt to serialize and then deserialize a tendermint::block::Header using a binary data format (postcard) returned an error: Err(DeserializeBadOption).

This error is not a bug in postcard - in my testing also occurs when attempting roundtrip serialization with other binary data formats.

Steps to reproduce

Attempt roundtrip serialization of a tendermint::block::Header with a suitable (non-JSON) data format. The following test case uses tendermint = "0.27" and, serde_json = "1", and postcard = { version = "1", features = ["use-std"]}.

    const HEADER_JSON: &'static str = r#"{
        "version": {
          "block": "11",
          "app": "0"
        },
        "chain_id": "mocha",
        "height": "428545",
        "time": "2023-03-11T15:43:19.088215294Z",
        "last_block_id": {
          "hash": "8FAB396B01B0781B309D2EB438F41FA6A76AA28308AF2FB84D200C756AB48975",
          "part_set_header": {
            "total": 1,
            "hash": "76A87FAAA0D8AF60A6D3DD8DE57FED54E9068B633D11A089ADC38D3C18922741"
          }
        },
        "last_commit_hash": "4FC44DDEF86A36A3AA544F7A913E2F1675A27CEF312E0034747A02D4A560251A",
        "data_hash": "C6FA94EA5B4640A69C830C6A1BB6B86F04800B852F8650B28CDD5CE7E3A307DD",
        "validators_hash": "FA8B443035B476A1D6B704C36CF1460D229D5927BCFD93197E680B2EB63F4568",
        "next_validators_hash": "FA8B443035B476A1D6B704C36CF1460D229D5927BCFD93197E680B2EB63F4568",
        "consensus_hash": "048091BC7DDC283F77BFBF91D73C44DA58C3DF8A9CBC867405D8B7F3DAADA22F",
        "app_hash": "6B4205AE7CEE329C7E49E8C4A102D7B8107CB604AFAA0EE87CFAC1704E8D6461",
        "last_results_hash": "EF4931FB9F6CCCA0C5FE8367BCF5044E785247ADDE925C6FE6B7C200E43EEFA5",
        "evidence_hash": "E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855",
        "proposer_address": "E1570712868BE0B12622BBAE08D96F5840F9D018"
      }"#;

    #[test]
    fn test_tm_header_serde_roundtrip() {
        let header: tendermint::block::Header = serde_json::from_str(HEADER_JSON).unwrap();

        let serialized_header = postcard::to_stdvec(&header).unwrap();
        let _deserialized_header: tendermint::block::Header =
            postcard::from_bytes(&serialized_header).unwrap();
    }

The provided test will fail with the following output:

running 1 test
thread 'test::test_tm_header_serde' panicked at 'called `Result::unwrap()` on an `Err` value: DeserializeBadOption'

Definition of "done"

Either roundtrip serialization succeeds using binary data formats, or documentation is added explaining that serialization to/from formats other than json is not supported.

thanethomson commented 1 year ago

Just to be clear, you're using tendermint-rs v0.27.0, right? If so, have you tried the latest release of tendermint-rs? (v0.31.1)

Asking because we don't have capacity to support older releases of tendermint-rs at present.

preston-evans98 commented 1 year ago

Makes sense. I just tested with v0.31.1 and the issue is still present.

preston-evans98 commented 1 year ago

@thanethomson It looks like tendermint::block::Id is the culprit here. You can easily reproduce the failure using the same HEADER_JSON as above:

    #[test]
    fn test_tm_block_id_serde_roundtrip() {
        let header: tendermint::block::Header = serde_json::from_str(HEADER_JSON).unwrap();
        let last_block_id: tendermint::block::Id = header.last_block_id.unwrap();

        let serialized = postcard::to_stdvec(&last_block_id).unwrap();
        let _deserialized: tendermint::block::Id = postcard::from_bytes(&serialized).unwrap();
    }
tony-iqlusion commented 1 year ago

I think the serde trait impls on all of those types are designed for serializing/deserializing JSON, not arbitrary other formats.

Protobufs are the preferred binary encoding.

preston-evans98 commented 1 year ago

I think the serde trait impls on all of those types are designed for serializing/deserializing JSON, not arbitrary other formats.

Yeah, that seems to be the case. I know that serialization in this package has always been a bit tricky due to compatibility requirements. Just documenting the restriction is probably sufficient for now.

preston-evans98 commented 1 year ago

Will try to put up a PR to fix the documentation and close this issue later. @thanethomson @romac, any thoughts on where the most appropriate place for that documentation would be?

As an aside, it seems like it should be relatively straightforward to use the serde::serializer::is_human_readable method to do some special-casing for JSON. Then serialization might still be broken for non-JSON human-readable types, but binary formats would work as expected - which I think would address most use cases that want serde anyway. Not a priority for now, just leaving this here in case it's useful later.

tony-iqlusion commented 1 year ago

@preston-evans98 but why not use Protobuf as a binary serialization? It's the canonical binary format for these types

preston-evans98 commented 1 year ago

@tarcieri TBC my use case is currently workable without serde, so I'm fine with just documenting the lack of support and closing out the issue.

That being said, there are a few reasons why someone might want to use serde over protobuf. Performance is one - a format like bincode or postcard should be faster to serialize/deserialize in theory (and that's born out in all of the benchmarks I've seen). So if you're storing/retrieving a large volume of this data in a DB, you probably want one of them all else being equal.

Another reason is library support. Lots and lots of packages are compatible with serde and have blanket implementations for their traits in terms of serde. (In fact, that's what prompted this issue - i wanted to use a library and it had default support for serde - so I just used it. But because of the bug, data was getting silently garbled in-flight - so I had to hand write some logic use protobufs instead).

The final reason is just familiarity. Pretty much every Rust programmer has used the Serde APIs, and IME prost is a lot less widely know.

tony-iqlusion commented 1 year ago

A proliferation of binary formats harms interoperability. It already exists with Amino and Protobuf