alloy-rs / core

High-performance, well-tested & documented core libraries for Ethereum, in Rust
https://alloy.rs
Apache License 2.0
763 stars 137 forks source link

[Question] How to “pretty easily” implement (De)/serialize for DynSolValue with serde(with=…) #614

Closed wtdcode closed 4 months ago

wtdcode commented 4 months ago

Component

dyn-abi

Describe the feature you would like

As title, I want to know how to “pretty easily” do this? Is there any one-line solution with this?

Additional context

612

https://github.com/gakonst/ethers-rs/issues/2667

I might be dumb at serde but I think it is offensive to mark the issue “pretty easy” and close it without any concrete solution. Therefore, I reopen one to ask for an “easy” solution cuz I can’t get why it is “easy” compared to deriving the attributes with two more english words.

wtdcode commented 4 months ago

Two cents: Alloy claims it follows rust code of conduct. So I expect:

Please keep unstructured critique to a minimum. If you have solid ideas you want to experiment with, make a fork and see how it works.

prestwich commented 4 months ago

As a more complete example, I made you a struct following @DaniPopes recommendation to serialize the type alongside the value. This is strictly better than the requested serde_with, as it ensures that type info can never be lost (as serializing a DynSolValue without its DynSolType can. Please note that when you use this, you are implementing an application-specific serialization format that no other tool will support

Please do not keep opening new issues or PRs for this.

use alloy_dyn_abi::{DynSolType, DynSolValue};
use alloy_primitives::Bytes;
use serde::{
    de::{MapAccess, Visitor},
    ser::SerializeMap,
    Deserialize, Deserializer, Serialize, Serializer,
};

#[derive(Debug)]
struct TypedValue {
    ty: DynSolType,
    value: DynSolValue,
}

impl Serialize for TypedValue {
    fn serialize<S: Serializer>(&self, s: S) -> Result<S::Ok, S::Error> {
        let mut map = s.serialize_map(Some(2))?;
        map.serialize_entry("type", &self.ty.sol_type_name())?;
        map.serialize_entry("value", &self.value.abi_encode())?;
        map.end()
    }
}

impl<'de> Deserialize<'de> for TypedValue {
    fn deserialize<D: Deserializer<'de>>(d: D) -> Result<Self, D::Error> {
        struct TypedValueVisitor;

        impl<'de> Visitor<'de> for TypedValueVisitor {
            type Value = TypedValue;

            fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
                formatter.write_str("a map containing a type string and an encoded value")
            }

            fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
            where
                A: MapAccess<'de>,
            {
                let mut ty: Option<String> = None;
                let mut value: Option<Bytes> = None;
                while let Some(key) = map.next_key()? {
                    match key {
                        "type" => {
                            if ty.is_some() {
                                return Err(serde::de::Error::duplicate_field("type"));
                            }
                            ty = Some(map.next_value()?);
                        }
                        "value" => {
                            if value.is_some() {
                                return Err(serde::de::Error::duplicate_field("value"));
                            }
                            value = Some(map.next_value()?);
                        }
                        _ => {
                            return Err(serde::de::Error::unknown_field(key, &["type", "value"]));
                        }
                    }
                }
                let ty: DynSolType = ty
                    .ok_or_else(|| serde::de::Error::missing_field("type"))?
                    .parse()
                    .map_err(serde::de::Error::custom)?;
                let value = value.ok_or_else(|| serde::de::Error::missing_field("value"))?;
                let value = ty.abi_decode(&value).map_err(serde::de::Error::custom)?;

                Ok(TypedValue { ty, value })
            }
        }

        d.deserialize_map(TypedValueVisitor)
    }
}

#[cfg(test)]
mod test {
    use super::*;
    use alloy_primitives::U256;

    #[test]
    fn roundtrip() {
        let ty = "uint256".parse().unwrap();
        let value = U256::ZERO.into();
        let typed_value = TypedValue { ty, value };

        let serialized = serde_json::to_string(&typed_value).unwrap();
        let deserialized: TypedValue = serde_json::from_str(&serialized).unwrap();

        assert_eq!(typed_value.ty, deserialized.ty);
        assert_eq!(typed_value.value, deserialized.value);
    }

    #[test]
    fn complex() {
        let ty = "(uint256,bytes15,bytes)".parse().unwrap();
        let value = DynSolValue::Tuple(
            vec![
                U256::from(42).into(),
                DynSolValue::FixedBytes([0; 32].into(), 15),
                DynSolValue::Bytes((0..244).into_iter().collect()),
            ]
            .into(),
        );

        let typed_value = TypedValue { ty, value };

        let serialized = serde_json::to_string(&typed_value).unwrap();
        let deserialized: TypedValue = serde_json::from_str(&serialized).unwrap();

        assert_eq!(typed_value.ty, deserialized.ty);
        assert_eq!(typed_value.value, deserialized.value);
    }
}
wtdcode commented 4 months ago

As a more complete example, I made you a struct following @DaniPopes recommendation to serialize the type alongside the value. This is strictly better than the requested serde_with, as it ensures that type info can never be lost (as serializing a DynSolValue without its DynSolType can. Please note that when you use this, you are implementing an application-specific serialization format that no other tool will support

Please do not keep opening new issues or PRs for this.

use alloy_dyn_abi::{DynSolType, DynSolValue};
use alloy_primitives::Bytes;
use serde::{
    de::{MapAccess, Visitor},
    ser::SerializeMap,
    Deserialize, Deserializer, Serialize, Serializer,
};

#[derive(Debug)]
struct TypedValue {
    ty: DynSolType,
    value: DynSolValue,
}

impl Serialize for TypedValue {
    fn serialize<S: Serializer>(&self, s: S) -> Result<S::Ok, S::Error> {
        let mut map = s.serialize_map(Some(2))?;
        map.serialize_entry("type", &self.ty.sol_type_name())?;
        map.serialize_entry("value", &self.value.abi_encode())?;
        map.end()
    }
}

impl<'de> Deserialize<'de> for TypedValue {
    fn deserialize<D: Deserializer<'de>>(d: D) -> Result<Self, D::Error> {
        struct TypedValueVisitor;

        impl<'de> Visitor<'de> for TypedValueVisitor {
            type Value = TypedValue;

            fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
                formatter.write_str("a map containing a type string and an encoded value")
            }

            fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
            where
                A: MapAccess<'de>,
            {
                let mut ty: Option<String> = None;
                let mut value: Option<Bytes> = None;
                while let Some(key) = map.next_key()? {
                    match key {
                        "type" => {
                            if ty.is_some() {
                                return Err(serde::de::Error::duplicate_field("type"));
                            }
                            ty = Some(map.next_value()?);
                        }
                        "value" => {
                            if value.is_some() {
                                return Err(serde::de::Error::duplicate_field("value"));
                            }
                            value = Some(map.next_value()?);
                        }
                        _ => {
                            return Err(serde::de::Error::unknown_field(key, &["type", "value"]));
                        }
                    }
                }
                let ty: DynSolType = ty
                    .ok_or_else(|| serde::de::Error::missing_field("type"))?
                    .parse()
                    .map_err(serde::de::Error::custom)?;
                let value = value.ok_or_else(|| serde::de::Error::missing_field("value"))?;
                let value = ty.abi_decode(&value).map_err(serde::de::Error::custom)?;

                Ok(TypedValue { ty, value })
            }
        }

        d.deserialize_map(TypedValueVisitor)
    }
}

#[cfg(test)]
mod test {
    use super::*;
    use alloy_primitives::U256;

    #[test]
    fn roundtrip() {
        let ty = "uint256".parse().unwrap();
        let value = U256::ZERO.into();
        let typed_value = TypedValue { ty, value };

        let serialized = serde_json::to_string(&typed_value).unwrap();
        let deserialized: TypedValue = serde_json::from_str(&serialized).unwrap();

        assert_eq!(typed_value.ty, deserialized.ty);
        assert_eq!(typed_value.value, deserialized.value);
    }

    #[test]
    fn complex() {
        let ty = "(uint256,bytes15,bytes)".parse().unwrap();
        let value = DynSolValue::Tuple(
            vec![
                U256::from(42).into(),
                DynSolValue::FixedBytes([0; 32].into(), 15),
                DynSolValue::Bytes((0..244).into_iter().collect()),
            ]
            .into(),
        );

        let typed_value = TypedValue { ty, value };

        let serialized = serde_json::to_string(&typed_value).unwrap();
        let deserialized: TypedValue = serde_json::from_str(&serialized).unwrap();

        assert_eq!(typed_value.ty, deserialized.ty);
        assert_eq!(typed_value.value, deserialized.value);
    }
}

Thanks for the concrete sample. This looks much better. Here are a few comments:

no other tool will support

As stated, it's obviously well supported without any extra efforts by ethers-rs (ethabi) but not alloy now.

serialize the type alongside the value

This seems weird to me because DynSolValue and DynSolType hold exactly the same type of information (see their definitions). Note you can always do DynSolValue::as_type which almost always gets Some(ty). Of course, I understand the idea is to save information along with the value. But anyway, this requires extra effort to manually do it, though much straightforward compared to serde(with=...) tricks.

Please do not keep opening new issues or PRs for this.

I expect to have a healthy discussion environment instead of closing issues all of a sudden. Thanks again for your efforts.

prestwich commented 4 months ago

This seems weird to me because DynSolValue and DynSolType hold exactly the same type of information (see their definitions). Note you can always do DynSolValue::as_type which almost always gets Some(ty).

Yes, DynSolValue contains a subset of the information contained in DynSolType. And getting into the gap there is not on the common user path, but can be deliberately triggered by any user. So you want your serialization format to handle all input, even input crafted to trigger the None result on as_type

But more importantly, the .abi_encode() blob (the only standard encoding of a DynSolValue) does not contain that type information. We use the DynSolType to decode and annotate the blob, making a DynSolValue instance. Without the full solidity type information, we can't decode ABI blobs at all (which incidentally is one of the reasons ABI is incompatible with serde's data model). The extra type information we have in the structure of DynSolValue exists to help rust code access the information (similar to serde_json::Value) in a predictable way, and is used to ABI encode the value, but is not part of the encoding of the value.

Long story short, it's necessary to keep the sol type info even if we were to encode the recursive structure of DynSolValue (like a serde derive would)

wtdcode commented 4 months ago

This seems weird to me because DynSolValue and DynSolType hold exactly the same type of information (see their definitions). Note you can always do DynSolValue::as_type which almost always gets Some(ty).

Yes, DynSolValue contains a subset of the information contained in DynSolType. And getting into the gap there is not on the common user path, but can be deliberately triggered by any user. So you want your serialization format to handle all input, even input crafted to trigger the None result on as_type

But more importantly, the .abi_encode() blob (the only standard encoding of a DynSolValue) does not contain that type information. We use the DynSolType to decode and annotate the blob, making a DynSolValue instance. Without the full solidity type information, we can't decode ABI blobs at all (which incidentally is one of the reasons ABI is incompatible with serde's data model). The extra type information we have in the structure of DynSolValue exists to help rust code access the information (similar to serde_json::Value) in a predictable way, and is used to ABI encode the value, but is not part of the encoding of the value.

Long story short, it's necessary to keep the sol type info even if we were to encode the recursive structure of DynSolValue (like a serde derive would)

Thanks for the feedback. I understand the motivation. I just mean I would implement this exactly as serde auto derivation does: get a tag for each enum variant (so the type information is not saved as the canonical type string, but recursive mapping instead). But anyway, as stated, my ideal solution is not bloating extra code.

prestwich commented 4 months ago

The point is that the serde derived implementation would fail for some class of valid objects. and any user easily construct a object that triggers that failure

wtdcode commented 4 months ago

The point is that the serde derived implementation would fail for some class of valid objects. and any user easily construct a object that triggers that failure

I fail to get this. Why it would fail for some class of valid objects?

prestwich commented 4 months ago

FixedArray(vec![]) or Array(vec![]) can be serde (de)serialized using the , but not always used effectively in memory without the DynSolType. So the serialization format has failed in that it hasn't protected the user or the program logic

Suppose you make an abi encoding service (e.g. anything that provides eth_signTypedData endpoints). Without the typestring, any user can craft an input that is valid, but causes the endpoint to fail to produce a signature. This is why the type info is required for eip712 objects

wtdcode commented 4 months ago

FixedArray(vec![]) or Array(vec![]) can be serde (de)serialized using the , but not always used effectively in memory without the DynSolType. So the serialization format has failed in that it hasn't protected the user or the program logic

Suppose you make an abi encoding service (e.g. anything that provides eth_signTypedData endpoints). Without the typestring, any user can craft an input that is valid, but causes the endpoint to fail to produce a signature. This is why the type info is required for eip712 objects

I can’t get it. Why not just use: https://docs.rs/alloy-dyn-abi/latest/alloy_dyn_abi/enum.DynSolValue.html#method.abi_encode for any DynSolValue?

I can get that for deserializing abi encoded bytes without type information is not possible but for encoding DynSolValue, it should be fine without type information?

The two array cases you mentioned are due to the lack of element types. But since they have 0 length, it should produce the correct bytes anyway? I don’t have my laptop at this moment but I will try to confirm.

wtdcode commented 4 months ago

Okay, I see after some quick testing.

Did you mean that, it's possible for users to provide some data like: Array(vec![Bool(false), Address(address!("..."))]), which is perfectly valid for serde auto implementation but not for the actual ABI definitions? This seems the exact concern of not deriving (De)/serialize.

However, shall alloy-core also check this when doing as_type? Currently it blindly uses the type of the first element.