birkenfeld / serde-pickle

Rust (de)serialization for the Python pickle format.
Apache License 2.0
188 stars, 28 forks

Can't decode byte strings as Vec<u8> #2

Closed. vorner closed this issue 7 years ago

vorner commented 7 years ago

Since Python 2 uses the same data type to store text strings and binary blobs (byte strings), it is unfortunate that binary blobs can't be reasonably decoded. I have this Python 2 program (let's assume the string "world" might actually contain something that is not valid UTF-8):

import pickle

data = {
    "hello": b"world",
}

with open("data.pickle", "wb") as f:
    pickle.dump(data, f)
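
To make the "not valid UTF-8" assumption concrete, here is a minimal sketch. It assumes Python 3 (where bytes and str are distinct types) rather than the Python 2 of the program above, and it pickles in memory instead of writing data.pickle. The helper names roundtrip and is_valid_utf8 and the payload b"wor\xffld" are hypothetical, chosen only for illustration.

```python
import pickle


def roundtrip(payload: bytes) -> bytes:
    """Pickle a dict holding `payload` in memory and return the value read back."""
    blob = pickle.dumps({"hello": payload})
    return pickle.loads(blob)["hello"]


def is_valid_utf8(payload: bytes) -> bool:
    """Report whether `payload` can be decoded as UTF-8 text."""
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # b"wor\xffld" is a hypothetical payload; the byte 0xff never occurs in valid UTF-8.
    for payload in (b"world", b"wor\xffld"):
        print(repr(roundtrip(payload)), is_valid_utf8(payload))
```

On the Python side the pickle round trip keeps the bytes intact either way; only the attempt to treat them as text can fail, and that is the risk a consumer takes when it decodes the value as a string.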

Now, if I use this Rust program to decode it:

extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate serde_pickle;

use std::fs::File;
use serde_pickle::from_reader;

#[derive(Debug, Deserialize)]
struct Data {
    hello: String,
}

fn main() {
    let data: Data = from_reader(File::open("data.pickle").unwrap()).unwrap();
    println!("{:?}", data);
}

it works, except that I run the risk of failing to decode the message if it contains something that is not valid UTF-8. But if I try to decode it into Vec<u8>, I get this error instead:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Syntax(Structure("invalid type: byte array, expected a sequence"))', /checkout/src/libcore/result.rs:860

I believe a byte array is a sequence, kind of? And the documentation suggests this should be supported: „Strings (Rust Vec<u8>)“.
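
The "byte array is a sequence" reading can be illustrated on the Python side. This is a minimal sketch, assuming Python 3 (in Python 2, iterating a str yields one-character strings instead of integers). The helper names as_byte_values and from_byte_values are hypothetical. It shows a byte string viewed as a list of integers in the range 0 to 255, which is the same shape a Rust Vec<u8> has.

```python
def as_byte_values(blob: bytes) -> list:
    """Return the integer value (0-255) of each byte in `blob`."""
    return list(blob)


def from_byte_values(values) -> bytes:
    """Rebuild a byte string from an iterable of integers in the range 0-255."""
    return bytes(values)


if __name__ == "__main__":
    values = as_byte_values(b"world")
    print(values)
    print(from_byte_values(values))
```

Nothing here says how serde-pickle handles the value internally; it only shows that the two views carry the same information.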

birkenfeld commented 7 years ago

This was actually the same bug as #1. Will work fine in 0.4.0. Thanks for the reports, and sorry for the delay in fixing.