Cysharp / MemoryPack

Zero encoding extreme performance binary serializer for C# and Unity.
MIT License
3.01k stars 182 forks source link

POCO structs are treated as unmanaged despite GenerateType.VersionTolerant #300

Open Rurouni opened 1 month ago

Rurouni commented 1 month ago

Hi, we've recently run into an unforeseen backward compatibility issue in such scenario:

1) let's say we have such struct in protocol

[MemoryPackable(GenerateType.VersionTolerant)]
public partial struct SomeStruct
{
    [MemoryPackOrder(0)] public int SomeField1;
    [MemoryPackOrder(1)] public int SomeField2;
}

2) next we add another field

[MemoryPackable(GenerateType.VersionTolerant)]
public partial struct SomeStruct
{
    [MemoryPackOrder(0)] public int SomeField1;
    [MemoryPackOrder(1)] public int SomeField2;
    [MemoryPackOrder(2)] public int NewField;
}

Expectation: based on docs, we expected the old data would be still compatible just result in NewField getting default value on deserialisation

Actual Behavior: The struct is treated as unmanaged type, and the generated formatter does basic memcopy, ignoring GenerateType.VersionTolerant

From my initial investigation, it appears that problem lies in https://github.com/Cysharp/MemoryPack/blob/4b1e3f889fbc8bae6e700bc148f6dbba0a4b7170/src/MemoryPack.Generator/MemoryPackGenerator.Parser.cs#L112 so as soon as Roslyn tells us that it's unmanaged struct we treat it as unmanaged type disregarding GenerateType in attribute

This is quite dangerous, as one can expect the protocol to be backwards compatible if you use GenerateType.VersionTolerant and adheres to rules, but even adding a new property will make old data unreadable.

Thank you

Rurouni commented 1 month ago

@neuecc Sorry for pinging you directly, but can you please take a look at this issue? I can write an MR with a fix but I might need some pointers on where is the best place to correct this check

neuecc commented 4 weeks ago

Thank you. In this case, the principle from the README that we should follow is:

Therefore, if we were to add anything, it would be to include a warning that VersionTolerant cannot be specified for unmanaged structs.

Rurouni commented 3 weeks ago

Wouldn't it be better to generate version-tolerant formatter code instead when struct is unmanaged but attribute is present? In some cases, you might have user-space structs in a shared game codebase between client and server that get reused in player profile data stored in a database, for example. In this case having version tolerant serialisation/deserialistion is a must-have. I don't think it's even a warning; it should be an error because the line between "unmanaged" and not "unmanaged" struct (from Roslyn's perspective) is not that obvious to the average developer, but the consequences are dire and only detected when much later down the line you can't suddenly break backward compatibility

Technically, there is a non-obvious workaround already, for example, if someone slightly changes the struct definition, for example, by adding some kind of reference type field that is always empty, and now suddenly formatter generated code is version tolerant.

neuecc commented 2 weeks ago

If we allow that, it will lead to inconsistencies in various places. The primary check for Serialize<SomeStruct> or Serialize<SomeStruct[]> is whether they are Unmanaged.

To maintain consistency, we would need to refactor the entire system to always prioritize the VersionTolerant check over the Unmanaged check.

Personally, I think struct that changes frequently should be made as class.

Rurouni commented 2 weeks ago

I see. In that case, analyser error is probably the best way forward indeed