paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/
1.88k stars 687 forks source link

serde for RuntimeString is broken #4001

Closed nazar-pc closed 3 days ago

nazar-pc commented 7 months ago

Is there an existing issue?

Experiencing problems? Have you tried our Stack Exchange first?

Description of bug

I just had a strange issue when I tried to serialize RuntimeString on the client into JSON and then deserialize in the runtime. Turned out serde implementation for RuntimeString is broken because between client and runtime owned value will be String or Vec<u8> and as the result one end will serialize a String and another will try to deserialize it as a vector of bytes, resulting in confusing errors.

I have no idea what problem RuntimeString is even solving, looks like what you want is just Cow<'static, str> unless I miss something.

I hit this due to sp_genesis_builder::GenesisBuilder trying to decode JSON.

Steps to reproduce

No response

ggwpez commented 7 months ago

Maybe it would work to give the variants explicit indices?
Currently it looks like the String and Vector variant share the index depending on the std feature flag. But it seems hacky...

(yes could be that the whole type is a bit useless though cc @michalkucharczyk)

bkchr commented 7 months ago

I have no idea what problem RuntimeString is even solving, looks like what you want is just Cow<'static, str> unless I miss something.

This comes from a time when we did not had Encode and Decode enabled in SCALE for no_std :see_no_evil: Now, this is probably a good idea to change this.

michalkucharczyk commented 7 months ago

Does it mean you have RuntimeString field somewhere in your RuntimeGenesisConfig?

We could use serde variant attributes to deserialize Owned as String in nostd.

bkchr commented 7 months ago

I think the better solution is just to change RuntimeString to Cow as @nazar-pc said. Otherwise I would propose that we just write some custom Serialize/Deserialize.

nazar-pc commented 7 months ago

Does it mean you have RuntimeString field somewhere in your RuntimeGenesisConfig?

Yes

I think the better solution is just to change RuntimeString to Cow as @nazar-pc said. Otherwise I would propose that we just write some custom Serialize/Deserialize.

The only reason for RuntimeString to exist that I see is that it allows to bypass UTF-8 verification in runtime in some cases. Not sure how critical it is though.

bkchr commented 7 months ago

This comes from a time when we did not had Encode and Decode enabled in SCALE for no_std 🙈

I mean this is the reason why this exists, because I know the guy that added the RuntimeString :P

nazar-pc commented 2 months ago

Recommended course of action to fix this?

bkchr commented 2 months ago

If moving it to Cow<'static, str> works, I would propose we do it this way.

nazar-pc commented 2 months ago

My only concern is that it'll be a breaking change for JSON serialization, hence the question. If someone stored serialized RuntimeString in actual runtime, they'll be surprised to see it not working anymore (I guess we can work around that by implementing custom deserializer that will support decoding bytes as well, but that feels ugly).

bkchr commented 2 months ago

Can we even differentiate between bytes or string encoded?

nazar-pc commented 2 months ago

I think the answer very much depends on the serialization format. We can definitely tell that in JSON and probably some other self-describing formats, but I don't think it is universally true.

bkchr commented 2 months ago

If someone stored serialized RuntimeString in actual runtime, they'll be surprised to see it not working anymore (I guess we can work around that by

Just thought about this again. Actually if they stored it as SCALE encoded, the encoding is the same and decoding should not fail between Vec<u8> and String.

nazar-pc commented 1 month ago

I went a bit further and removed the data structure as redundant in https://github.com/paritytech/polkadot-sdk/pull/5693