3Hren / msgpack-rust

MessagePack implementation for Rust / msgpack.org[Rust]
MIT License

Deserialize raw bytes into Vec<u8> #249

Open jeromegn opened 4 years ago

jeromegn commented 4 years ago

Maybe I missed it, but I'm unable to deserialize a raw binary value as a Vec<u8>:

    #[test]
    fn some_test() {
        let b = rmp_serde::to_vec(&rmpv::Value::Binary(vec![0u8; 16])).unwrap();
        let v: Vec<u8> = rmp_serde::from_read_ref(&b).unwrap();
    }

    running 1 test
    thread 'tests::some_test' panicked at 'called `Result::unwrap()` on an `Err` value: Syntax("invalid type: byte array, expected a sequence")', src/lib.rs:503:26
    stack backtrace:
    < not relevant >

AFAICT, it's the same issue with fixed arrays.

The only workaround is to use rmp_serde::Raw, but that seems odd.

chpio commented 4 years ago

I think that's why serde_bytes exists.

Without specialization, Rust forces Serde to treat &[u8] just like any other slice and Vec<u8> just like any other vector. In reality this particular slice and vector can often be serialized and deserialized in a more efficient, compact representation in many formats.
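To make "more compact representation" concrete for MessagePack, here is a minimal std-only sketch that hand-encodes the same bytes two ways: as the bin family (one header plus the raw bytes) and as an array of unsigned integers (one header plus one encoded integer per element). The function names `encode_bin` and `encode_array_of_u8` are made up for this illustration and are not part of rmp or rmp_serde; the marker bytes come from the MessagePack format specification.

```rust
/// Encode `data` with the MessagePack bin family: one header, then the raw bytes.
fn encode_bin(data: &[u8]) -> Vec<u8> {
    let len = data.len();
    let mut out = Vec::with_capacity(len + 5);
    if len <= u8::MAX as usize {
        out.push(0xc4); // bin 8: 1-byte length
        out.push(len as u8);
    } else if len <= u16::MAX as usize {
        out.push(0xc5); // bin 16: 2-byte big-endian length
        out.extend_from_slice(&(len as u16).to_be_bytes());
    } else {
        out.push(0xc6); // bin 32 (lengths above u32::MAX are not handled in this sketch)
        out.extend_from_slice(&(len as u32).to_be_bytes());
    }
    out.extend_from_slice(data); // payload is copied in one go
    out
}

/// Encode `data` as a MessagePack array of unsigned integers, one element at a time.
fn encode_array_of_u8(data: &[u8]) -> Vec<u8> {
    let len = data.len();
    let mut out = Vec::with_capacity(len + 5);
    if len <= 15 {
        out.push(0x90 | len as u8); // fixarray: length lives in the marker
    } else if len <= u16::MAX as usize {
        out.push(0xdc); // array 16
        out.extend_from_slice(&(len as u16).to_be_bytes());
    } else {
        out.push(0xdd); // array 32 (same length caveat as above)
        out.extend_from_slice(&(len as u32).to_be_bytes());
    }
    for &b in data {
        if b <= 0x7f {
            out.push(b); // positive fixint: the value is its own encoding
        } else {
            out.push(0xcc); // uint 8: marker byte, then the value
            out.push(b);
        }
    }
    out
}

fn main() {
    let zeros = [0u8; 16];
    let high = [0xffu8; 16];
    println!("bin,   zeros: {} bytes", encode_bin(&zeros).len());
    println!("array, zeros: {} bytes", encode_array_of_u8(&zeros).len());
    println!("bin,   0xff:  {} bytes", encode_bin(&high).len());
    println!("array, 0xff:  {} bytes", encode_array_of_u8(&high).len());
}
```

The array form needs a branch per element and can take up to two bytes per value, while the bin form is a header followed by a single copy. That difference is a property of the wire format, independent of how any particular crate implements it.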

So you're serializing the data as bytes, because Value::Binary's Serialize impl calls serialize_bytes. When deserializing, though, you ask for a Vec<u8>, and the generic Deserialize impl for Vec<T> expects a sequence of u8 values, not bytes. That mismatch causes the error you're seeing.

The problem has nothing to do with rmp in particular. It's a generic serde problem, which can be worked around with the serde_bytes crate mentioned above.
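The bytes-versus-sequence mismatch can be sketched with a toy model that uses only the standard library. `Token`, `decode_generic`, and `decode_bytes_aware` are hypothetical names invented here; they imitate the shape of the problem and are not serde's or rmp_serde's real API.

```rust
/// Toy stand-in for what a format hands to a deserializer (not serde's real data model).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    /// A single opaque byte string, like MessagePack's bin family.
    Bytes(Vec<u8>),
    /// A sequence of individually encoded u8 values, like a MessagePack array.
    Seq(Vec<u8>),
}

/// Imitates the generic `Vec<T>` path: only a sequence is accepted.
fn decode_generic(token: &Token) -> Result<Vec<u8>, String> {
    match token {
        Token::Seq(items) => Ok(items.clone()),
        Token::Bytes(_) => Err("invalid type: byte array, expected a sequence".to_string()),
    }
}

/// Imitates a bytes-aware path in the spirit of serde_bytes: both shapes are accepted.
fn decode_bytes_aware(token: &Token) -> Result<Vec<u8>, String> {
    match token {
        Token::Bytes(bytes) => Ok(bytes.clone()),
        Token::Seq(items) => Ok(items.clone()),
    }
}

fn main() {
    // The data was written as bytes, as Value::Binary does in the original report.
    let written = Token::Bytes(vec![0u8; 16]);
    println!("generic:     {:?}", decode_generic(&written));
    println!("bytes-aware: {:?}", decode_bytes_aware(&written));
}
```

With the real crates, the corresponding fix would be to deserialize into serde_bytes::ByteBuf, or to annotate a Vec<u8> struct field with #[serde(with = "serde_bytes")], instead of asking for a bare Vec<u8>.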

thedavidmeister commented 4 years ago

would it be possible to use Any to handle Vec<u8> using the serde_bytes approach by default, or even as config in this crate?

https://doc.rust-lang.org/std/any/
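For reference, this is roughly what runtime detection of Vec<u8> through std::any looks like. It is only a sketch of the idea: `as_byte_vec` is a hypothetical helper, and the approach requires `T: 'static`, a bound that serde's impls for Vec<T> do not carry, so whether it could be wired into this crate is an open question.

```rust
use std::any::Any;

/// Returns the bytes if `value` is exactly a `Vec<u8>`, otherwise `None`.
/// The `Any` bound implies `T: 'static`.
fn as_byte_vec<T: Any>(value: &T) -> Option<&[u8]> {
    (value as &dyn Any)
        .downcast_ref::<Vec<u8>>()
        .map(|v| v.as_slice())
}

fn main() {
    let bytes: Vec<u8> = vec![1, 2, 3];
    let words: Vec<u16> = vec![1, 2, 3];
    // Only the Vec<u8> is recognized; a Vec<u16> falls through to the generic path.
    println!("{:?}", as_byte_vec(&bytes));
    println!("{:?}", as_byte_vec(&words));
}
```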

i did some benchmarking, round-tripping Vec<u8> through MessagePack with and without serde_bytes, and found that the difference is huge. so huge that it's kind of dangerous how easy it is to forget to add serde_bytes

i'm seeing about 20 MiB/s to round trip the generic way and 1-2 GiB/s with serde_bytes

round_trip_bytes/GenericBytesNewType/0
                        time:   [133.96 ns 135.28 ns 136.71 ns]
                        thrpt:  [0.0000   B/s 0.0000   B/s 0.0000   B/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
round_trip_bytes/GenericBytesNewType/1
                        time:   [209.19 ns 209.51 ns 209.78 ns]
                        thrpt:  [4.5460 MiB/s 4.5519 MiB/s 4.5590 MiB/s]
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low mild
  1 (10.00%) high severe
round_trip_bytes/GenericBytesNewType/1000
                        time:   [47.786 us 47.846 us 47.953 us]
                        thrpt:  [19.888 MiB/s 19.932 MiB/s 19.957 MiB/s]
round_trip_bytes/GenericBytesNewType/1000000
                        time:   [46.991 ms 47.062 ms 47.165 ms]
                        thrpt:  [20.220 MiB/s 20.264 MiB/s 20.295 MiB/s]
round_trip_bytes/SpecializedBytesNewType/0
                        time:   [215.04 ns 215.77 ns 217.15 ns]
                        thrpt:  [0.0000   B/s 0.0000   B/s 0.0000   B/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe
round_trip_bytes/SpecializedBytesNewType/1
                        time:   [230.90 ns 231.50 ns 231.90 ns]
                        thrpt:  [4.1124 MiB/s 4.1195 MiB/s 4.1303 MiB/s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe
round_trip_bytes/SpecializedBytesNewType/1000
                        time:   [364.77 ns 365.29 ns 365.92 ns]
                        thrpt:  [2.5451 GiB/s 2.5496 GiB/s 2.5532 GiB/s]
round_trip_bytes/SpecializedBytesNewType/1000000
                        time:   [726.08 us 728.63 us 730.26 us]
                        thrpt:  [1.2753 GiB/s 1.2782 GiB/s 1.2827 GiB/s]
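
As a sanity check on those figures, the reported throughput can be recomputed from the median times, along with the speedup per payload size. This is plain arithmetic on the numbers in the output; `mib_per_sec` and `speedup` are helper names introduced only for this sketch.

```rust
/// Throughput in MiB/s for `bytes` processed in `nanos` nanoseconds.
fn mib_per_sec(bytes: f64, nanos: f64) -> f64 {
    bytes / (nanos * 1e-9) / (1024.0 * 1024.0)
}

/// How many times faster the `fast` run is than the `slow` run.
fn speedup(slow_nanos: f64, fast_nanos: f64) -> f64 {
    slow_nanos / fast_nanos
}

fn main() {
    // Median round-trip times copied from the benchmark output, in nanoseconds.
    let generic_1k = 47_846.0; // 47.846 us
    let specialized_1k = 365.29; // 365.29 ns
    let generic_1m = 47_062_000.0; // 47.062 ms
    let specialized_1m = 728_630.0; // 728.63 us

    println!("generic 1000 B:        {:.3} MiB/s", mib_per_sec(1_000.0, generic_1k));
    println!("specialized 1000 B:    {:.3} MiB/s", mib_per_sec(1_000.0, specialized_1k));
    println!("generic 1000000 B:     {:.3} MiB/s", mib_per_sec(1_000_000.0, generic_1m));
    println!("specialized 1000000 B: {:.3} MiB/s", mib_per_sec(1_000_000.0, specialized_1m));
    println!("speedup at 1000 B:     {:.1}x", speedup(generic_1k, specialized_1k));
    println!("speedup at 1000000 B:  {:.1}x", speedup(generic_1m, specialized_1m));
}
```

The recomputed values agree with the reported thrpt lines, and the gap works out to roughly 131x at 1000 bytes and roughly 65x at 1000000 bytes.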