toby / serde-bencode

Serde backed Bencode encoding/decoding library for Rust.
MIT License

RawValue as a type for deserializing #20

Open letFunny opened 4 years ago

letFunny commented 4 years ago

What do you think of a RawValue type like in serde_json?

Use case

I want to use it to deserialize a Torrent struct without having to declare all the fields. I only care about the infohash, so I only want the name of the torrent decoded and the raw encoded info dictionary that the infohash is computed from. I think this would be an improvement over declaring all the complex structs and then encoding them again.

Problems

Off the top of my head we have two problems. The first is that bencoding lacks the structure necessary to just crop the string without context. If we encode a struct with three integers, a:1, b:2 and c:3, we get "d1:ai1e1:bi2e1:ci3ee". If I wanted to leave the "c" field raw, I cannot just crop the string to "1:ci3ee" because that is not valid bencoding.
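To make the cropping problem concrete: the tail "1:ci3ee" is not a valid value, but the value stored under "c" does occupy a self-delimiting span ("i3e"). Finding where that span ends means walking the encoding from the start of the dictionary. The following is a minimal sketch of that walk using only the standard library. `value_end` and `raw_dict_value` are hypothetical helpers written for illustration; they are not part of serde-bencode, and they do not validate integer digits or key ordering.

```rust
/// Returns the index one past the end of the bencoded value starting at `pos`,
/// or `None` if the input is truncated or starts with an unknown marker.
fn value_end(buf: &[u8], pos: usize) -> Option<usize> {
    match *buf.get(pos)? {
        // Integer: i<digits>e (the digits themselves are not validated here).
        b'i' => {
            let len = buf[pos..].iter().position(|&b| b == b'e')?;
            Some(pos + len + 1)
        }
        // List or dictionary: skip nested values until the closing 'e'.
        b'l' | b'd' => {
            let mut cur = pos + 1;
            while *buf.get(cur)? != b'e' {
                cur = value_end(buf, cur)?;
            }
            Some(cur + 1)
        }
        // Byte string: <length>:<bytes>
        b'0'..=b'9' => {
            let colon = pos + buf[pos..].iter().position(|&b| b == b':')?;
            let len: usize = std::str::from_utf8(&buf[pos..colon]).ok()?.parse().ok()?;
            let end = colon.checked_add(1)?.checked_add(len)?;
            if end <= buf.len() { Some(end) } else { None }
        }
        _ => None,
    }
}

/// Returns the raw bencoded bytes of the value stored under `key` in the
/// top-level dictionary of `buf`, without decoding that value.
fn raw_dict_value<'a>(buf: &'a [u8], key: &[u8]) -> Option<&'a [u8]> {
    if *buf.first()? != b'd' {
        return None;
    }
    let mut cur = 1;
    while *buf.get(cur)? != b'e' {
        // Dictionary keys must be byte strings.
        if !buf[cur].is_ascii_digit() {
            return None;
        }
        let key_end = value_end(buf, cur)?;
        let colon = cur + buf[cur..key_end].iter().position(|&b| b == b':')?;
        let val_end = value_end(buf, key_end)?;
        if &buf[colon + 1..key_end] == key {
            return Some(&buf[key_end..val_end]);
        }
        cur = val_end;
    }
    None
}
```

Under these assumptions, `raw_dict_value(b"d1:ai1e1:bi2e1:ci3ee", b"c")` yields the slice `i3e`, and the same call with the key `info` on a torrent file would yield the exact bytes the infohash is computed over. A RawValue type inside a serde Deserializer would need access to the input at this level, which a Visitor alone does not get.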

Quick solution

Off the top of my head, we could just create a new type that decodes the data into whatever it is, without caring about its shape, and then re-encodes it to a byte string.

use std::fmt;

use serde::de::{self, Deserialize, Deserializer, Visitor};
use serde_bencode::to_bytes;
use serde_bytes::ByteBuf;

/// Holds the bencoded bytes of a value whose structure we do not care about.
#[derive(Debug)]
pub struct RawValue(ByteBuf);

impl<'de> Deserialize<'de> for RawValue {
    fn deserialize<D>(deserializer: D) -> Result<RawValue, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct RawValueVisitor;

        impl<'de> Visitor<'de> for RawValueVisitor {
            type Value = RawValue;

            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                write!(formatter, "any valid bencoded value")
            }

            // Re-encode the decoded integer to recover its bencoded bytes.
            fn visit_i64<E>(self, value: i64) -> Result<Self::Value, E>
            where
                E: de::Error,
            {
                // Report an encoding failure as a serde error instead of panicking.
                to_bytes(&value)
                    .map(|b| RawValue(ByteBuf::from(b)))
                    .map_err(E::custom)
            }
        }

        deserializer.deserialize_any(RawValueVisitor)
    }
}

I have only implemented the visitor for i64 and I would have to implement all the others in the same manner.

I also tried to deserialize into a generic type without caring what it is (only that it implements Serialize):

let value: Box<dyn Serialize> = Box::new(Deserialize::deserialize(deserializer)?);

But it does not work because the trait cannot be made into an object, and I cannot figure out how to use generics to get it working.

I don't know enough about serde and Rust to figure out a better solution, so what do you think?
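On the `Box<dyn Serialize>` attempt: `Serialize::serialize` is generic over the serializer type, which is why the trait cannot be used as a trait object. A common way around this is a closed enum that covers the four kinds of value bencode can express, so "any value" becomes one concrete type that can be decoded and re-encoded. The sketch below shows only the re-encoding half, using the standard library alone. The `Bencode` enum and its `encode` method are hypothetical names for illustration. serde-bencode appears to provide a comparable enum in its `value` module, but that is an assumption to check against the crate documentation.

```rust
use std::collections::BTreeMap;

/// The four kinds of value bencode can represent.
#[derive(Debug, Clone, PartialEq)]
enum Bencode {
    Int(i64),
    Bytes(Vec<u8>),
    List(Vec<Bencode>),
    // BTreeMap iterates keys in sorted byte order, which bencode requires.
    Dict(BTreeMap<Vec<u8>, Bencode>),
}

impl Bencode {
    /// Appends the bencoded form of `self` to `out`.
    fn encode_into(&self, out: &mut Vec<u8>) {
        match self {
            Bencode::Int(n) => out.extend_from_slice(format!("i{}e", n).as_bytes()),
            Bencode::Bytes(b) => {
                out.extend_from_slice(format!("{}:", b.len()).as_bytes());
                out.extend_from_slice(b);
            }
            Bencode::List(items) => {
                out.push(b'l');
                for item in items {
                    item.encode_into(out);
                }
                out.push(b'e');
            }
            Bencode::Dict(entries) => {
                out.push(b'd');
                for (key, value) in entries {
                    // Keys are encoded as ordinary byte strings.
                    out.extend_from_slice(format!("{}:", key.len()).as_bytes());
                    out.extend_from_slice(key);
                    value.encode_into(out);
                }
                out.push(b'e');
            }
        }
    }

    /// Returns the bencoded form of `self` as a new byte vector.
    fn encode(&self) -> Vec<u8> {
        let mut out = Vec::new();
        self.encode_into(&mut out);
        out
    }
}
```

Re-encoding only reproduces the original bytes when the input was canonical (sorted keys, no leading zeros), so a RawValue built this way is a re-encoding rather than a true byte-for-byte copy of the input.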

josecelano commented 1 year ago

Hi @letFunny, I've just created a sample repo that contains a function to extract the infohash from the torrent bytes:

https://github.com/torrust/torrust-parse-torrent/blob/main/src/utils/parse_torrent.rs#L54-L69

pub fn calculate_info_hash(bytes: &[u8]) -> InfoHash {
    // Extract the info dictionary
    let metainfo: MetainfoFile = serde_bencode::from_bytes(bytes)
        .expect("Torrent file cannot be parsed from bencoded format");

    // Bencode the info dictionary
    let info_dict_bytes =
        serde_bencode::to_bytes(&metainfo.info).expect("Info dictionary cannot be bencoded");

    // Calculate the SHA-1 hash of the bencoded info dictionary
    let mut hasher = Sha1::new();
    hasher.update(&info_dict_bytes);
    let result = hasher.finalize();

    InfoHash::from_bytes(&result)
}

I'm reusing code from the project Torrust Index Backend.

I hope that helps.