dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.15k stars 4.71k forks source link

Scale back [Serializable] for .NET Core 2.0 #21433

Closed stephentoub closed 4 years ago

stephentoub commented 7 years ago

We brought [Serializable] and BinaryFormatter back for .NET Core 2.0 in the name of compat and helping code to migrate to core. But it comes with some serious implications, and it's becoming clear we took things too far. Doing this makes it very easy for code to take dependencies on private implementation details and will end up binding our hands for improvements/optimizations/refactorings/etc. in the future. And with all of the improvements/optimizations/refactorings/etc. already made in core, we've had to punt on the idea of cross-runtime serialization support. There needs to be a better happy-medium.

The primary need for BinaryFormatter and [Serializable] is on a few core types, e.g. primitives, some of the core collections (e.g. List<T>), some small wrapper types (e.g. Tuple), etc. Proposal from me and @morganbr for .NET Core 2.0:

  1. BinaryFormatter and friends are currently implemented in System.Runtime.Serialization.Formatters.dll. We keep that as-is.
  2. SerializableAttribute, ISerializable, etc. are all implemented in System.Private.CoreLib.dll and exposed through System.Runtime.dll. We keep that as-is.
  3. We remove most of the [Serializable] implementations we added back for .NET Core 2.0, keeping the most core and important 10% or so. For this set, we a) potentially implement ISerializable if not already implemented in order to separate the implementation from the surfaced serialization state and to minimize the impact of future refactorings and optimizations, and b) try to enable desktop/core serialization/deserialization, so that BinaryFormatter'd state can go back and forth between desktop and core. We'll need to finalize what the list of types is, but it would include all of the primiteves (e.g. Int32), core collection types (e.g. Dictionary), core wrapper types (e.g. KeyValuePair), Exception and important Exception-derived types, and a handful of others. We can add more types in the future as they're highlighted as blockers, but in the meantime we avoid taking on the burden of lots of serializable types that'll hamper our ability to move the platform forward as we'd like.

[edit] list of types we ended up scaling back to is here: https://docs.microsoft.com/en-us/dotnet/standard/serialization/binary-serialization#binary-serialization-in-net-core

stephentoub commented 7 years ago

cc: @danmosemsft, @jkotas, @Zhenlan, @KrzysztofCwalina, @kieranmo, @Petermarcu, @terrajobst, @billwert

morganbr commented 7 years ago

@stephentoub , I'm starting work on a tool to remove SerializableAttributes, stub out ISerializable implementations, and remove OnSerializing et al methods from whatever types we decide not to make serializable. Of course, I won't send out a PR until we get the final set of types.

danmoseley commented 7 years ago

try to enable desktop/core serialization/deserialization

This would mean we would have to take care with our ISerializable implementations to match the names used by desktop ISerializable impoelmentations (or desktop types' private fields if they use reflection). Also desktop needs to be tolerant of extra fields we may at some point want to serialize in there. Are either of those a concern?

This plan will make it feasible to support cross-version on the same platform and that will make it possible to support serialization between CoreRT, UWP, and .NET Core platforms as they should share all of CoreFX and eventually 80% of corelib: almost all the relevant code will be the same anyway. We shouldn't need to concern ourselves with cross-version tests until post 2.0.

stephentoub commented 7 years ago

This would mean we would have to take care with our ISerializable implementations to match the names used by desktop ISerializable impoelmentations (or desktop types' private fields if they use reflection).

Correct

Also desktop needs to be tolerant of extra fields we may at some point want to serialize in there.

Hopefully this will rarely be an issue. But in situations where core needed to add additional state that needed to be serialized, yes, we'd have to factor that in.

jkotas commented 7 years ago

try to enable desktop/core serialization/deserialization

I assume that we are also going to add TypeForwardedFromAttribute to point to the original desktop homes for these types.

stephentoub commented 7 years ago

I assume that we are also going to add TypeForwardedFromAttribute

Presumably we'll have to in order to try to support the cross-runtime serialization.

morganbr commented 7 years ago

Would we also have to add TypeForwardedFrom attributes to desktop to handle types serialized by Core?

weshaggard commented 7 years ago

While we are looking into this can I get you guys to take a look at https://github.com/dotnet/standard/issues/300 as well and help decide if OnDeserializedAttribute, OnSerializingAttribute, OnSerializedAttribute, and OnDeserializingAttribute should be removed from netstandard?

stephentoub commented 7 years ago

decide if OnDeserializedAttribute, OnSerializingAttribute, OnSerializedAttribute, and OnDeserializingAttribute should be removed from netstandard?

You mean whether the attributes themselves should be removed, or whether usage of them should be removed? For the former, I don't know why we'd remove them: whether or not we use them in our implementations, they're a supported part of the runtime serialization / BinaryFormatter scheme, and I'd think we'd want to keep them exposed for usage. For the latter, usage of these attributes represents implementation details in code and shouldn't be part of ref assemblies.

(If there's a bug in desktop related to this, seems we should fix the bug rather than hampering standard. My $.02.)

weshaggard commented 7 years ago

I was more talking about whether or not the APIs should be exposed. Right now I'm leaning towards your suggestion which is to leave them broken and get the fix in desktop but I wanted other opinions. If there are any other suggestions lets not derail this issue lets just have them at https://github.com/dotnet/standard/issues/300.

danmoseley commented 7 years ago

This is not going to be done for ZBB if we are to get it right.

morganbr commented 7 years ago

Here's a breakdown of work remaining:

stephan-tolksdorf commented 7 years ago

Please reconsider removing [Serializable] from the System.Text.Decoder types. AFAIK, serialization is currently the only way to persist the state of a decoder, which e.g. the FParsec parser combinator library needs in order to enable efficient backtracking in huge input streams (see https://github.com/stephan-tolksdorf/fparsec/blob/master/FParsecCS/Cloning.cs and https://github.com/stephan-tolksdorf/fparsec/blob/master/FParsecCS/CharStream.cs).

morganbr commented 7 years ago

@stephentoub , do you have suggestions for @stephan-tolksdorf ?

Decoder and friends have pretty different implementations between NetFX and Core so I'd think field-for-field compatibility between them would be undesirable. Additionally, I suspect they have a sizable graph of types that would also have to be serializable.

danmoseley commented 7 years ago

Expanded bullet list above and added @ViktorHofer for the testing.

danmoseley commented 7 years ago

@morganbr I believe you and @stephentoub have settled on the supported list. Can you post it here for all to see?

morganbr commented 7 years ago

The list is at https://gist.github.com/morganbr/2a9901b12f1d342285694e69983a12d0 . It's grown slightly due to dependencies of the types we originally chose (mostly by adding several comparers). It's possible the desktop compatibility changes could pull something else in, but right now, I'm not aware of anything else we'd add.

joperezr commented 7 years ago

@morganbr @ViktorHofer any updates on how this work is going? just want to make sure that we are still on track for 2.0

danmoseley commented 7 years ago

@joperezr we're tracking it closely :)

danmoseley commented 7 years ago

@krwq

jkotas commented 7 years ago

Serialize canned blob from Core on Desktop

This will require adding TypeForwardedFromAttribute attributes on the serialized types to store the original Desktop home in the serialized payload.

marek-safar commented 7 years ago

Could we instead of removing [Serializable] completely use some #if-def or partial type? Mono is reusing CoreFX code and wants to still be .NET compatible. Cross platform serialization is nice to have for us but removing Serializable from types which have it on .NET breaks any program which uses appdomain/remoting/asp.net session/etc even if it runs on single platform only.

karelz commented 7 years ago

My understanding was that it is more than just he attribute. We would need to preserve private field names, etc. -- see original post:

But it comes with some serious implications, and it's becoming clear we took things too far. Doing this makes it very easy for code to take dependencies on private implementation details and will end up binding our hands for improvements/optimizations/refactorings/etc. in the future

In general ifdef for Mono or partial/special class for Mono sounds like reasonable approach, assuming it is doable without blocking CoreFX to move the only-ifdef-Mono types forward as highlighted above. It might be easier to just have a chat to better understand Mono requirements and our options in the space. Thoughts?

stephentoub commented 7 years ago

My understanding was that it is more than just he attribute

It depends on Mono's needs, and whether for the specific type in question it just cares about being able to serialize/deserialize from that specific platform, or whether for the type in question it cares about being able to interop with desktop. Mono and CoreFx can have different goals/requirements for different types.

stephentoub commented 7 years ago

Could we instead of removing [Serializable] completely use some #if-def or partial type?

I think a partial type approach would be very reasonable; we could just mark the relevant types in corefx as partial, and then Mono could build in its own partial files with the appropriate attribution / serialization implementations that it wants. That should work for any type where either a) no fields need to be attributed as [NonSerializable]/[OptionalValue]/etc., or b) where corefx still has fields attributed as such.

marek-safar commented 7 years ago

@stephentoub yes, that'll work for simple cases only. Are you ok to have different approach for the rest?

stephentoub commented 7 years ago

that'll work for simple cases only. Are you ok to have different approach for the rest?

I expect it would work for most cases, anything where the partial file could provide the whole serialization implementation. It wouldn't work in cases where fields in the main file needed to be attributed and weren't, or where some of the serialization implementation was provided, but provided in a different manner than Mono needs, e.g. throwing a PlatformNotSupportedException.

For such cases, I think it'd be helpful to look at them on a case-by-case basis as they arise to see what kinds of issues we're dealing with. It may be that in some cases we need to split out certain pieces into another partial file, one providing the serialization implementation core uses (e.g. a deserialization ctor that throws, a field without an attribute on it, etc.) and one providing the serialization implementation Mono uses (e.g. a deserialization ctor that behaves however Mono needs it to, the same fields but with attributes applied, etc.)

What I'm not clear on yet is Mono's exact policy around serialization and compatibility. If Mono wants to share the source code for a type in core, but core uses different field names than does desktop, or even has an entirely different implementation / layout than desktop, what would Mono do? What is the list of types for which Mono guarantees roundtrip serialization/deserialization compatibility with desktop? Beyond that list, what guarantees does Mono make about serialization/deserialization? Just that it'll work in process, but not necessarily between runtimes, or between major versions of Mono, or...? Understanding that may help us all better figure out a story here.

ViktorHofer commented 7 years ago

Status of the remaining work to make serialization work between desktop and core:

cc @danmosemsft @stephentoub

lmolkova commented 7 years ago

Please do not remove Serializable attribute from the System.Diagnostics.Activity and it's nested types in future. dotnet/corefx#20640

danmoseley commented 7 years ago

Complete. @morganbr we need to put your list somewhere, perhaps you could put it on this repo's wiki? (Remove namevaluecollection)

danmoseley commented 7 years ago

Need an issue for these Remove unnecessary attributes and callbacks (NonSerialized, OnSerializing, etc) from non-serializable types (post 2.0) [up for grabs] CoreFX CoreCLR CoreRT (Nice to have only) Add Roslyn analyzer to build to check nothing has [Serializable] that's not on the approved list (post 2.0) [up for grabs]

ViktorHofer commented 7 years ago

Need an issue for these

Added

ViktorHofer commented 7 years ago

Also noteworthy, I created a script a few weeks ago which prints all serializable types (all access modifiers). This is the updated version from current master: https://gist.github.com/ViktorHofer/2da00443c8284c93f0aea5402c11238e

danmoseley commented 7 years ago

Updated: list of types we ended up scaling back to is here: https://docs.microsoft.com/en-us/dotnet/standard/serialization/binary-serialization#binary-serialization-in-net-core

mgravell commented 5 years ago

Can I ask a question on this? I found this today, when a DataSet failed to serialize correctly when RemotingFormat is SerializationFormat.Binary, because System.Data.SimpleType isn't [Serializable]. Is there a preferred route forward here? or is it just "sometimes, DataSet won't serialize"?

Note: DataSet is in the "blessed list"

karelz commented 5 years ago

@mgravell this is closed issue, we usually don't monitor them. If you don't get answer soon, I recommend to file a new issue with your question ...

ViktorHofer commented 5 years ago

DataSet only supports the non SerializationFormat.Binary option.

danmoseley commented 5 years ago

I answered the same question here https://github.com/dotnet/corefx/pull/20220#issuecomment-494890200

DataSet only supports the non SerializationFormat.Binary option.

Is that right @ViktorHofer - if so why do we not throw immediately when used? I thought you made DBNull serializable specifically for this ? https://github.com/dotnet/corefx/pull/23897