dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Add a good Binary Serializer #102535

Open VenkateshSrini opened 1 month ago

VenkateshSrini commented 1 month ago

BinaryFormatter is now marked as obsolete. There are many serializers, such as MessagePack and Protobuf, but each has its own overhead and limitations when performing serialization. I would suggest that the .NET team look into a MessagePack-style implementation with support for polymorphic serialization and compression out of the box (OOB). Where unsafe serialization is possible, it could be surfaced as compiler warnings that the developer can correct.

huoyaoyuan commented 1 month ago

There are many serializers, such as MessagePack and Protobuf, but each has its own overhead and limitations when performing serialization.

They are just "good" serializers. The limitations are something you must pay for.

support for polymorphic serialization

This is where the danger comes from.

compression OOB

This is a separate layer that can be added independently. It will definitely harm performance in real-world cases. If your data is large enough to benefit from compression, you should probably use a database-like solution.

The overhead of MessagePack- and Protobuf-like serializers is close to the minimum needed for a "correct" serializer. Dumping memory blocks is fundamentally incorrect for persisted storage.
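
To illustrate compression as a layer that sits outside the serializer, here is a minimal sketch in Python (chosen only for brevity; it is not a .NET API). The `serialize`/`deserialize` functions are stand-ins based on JSON, and the one-byte framing flag and the `pack`/`unpack` names are assumptions made up for this example.

```python
import json
import zlib


def serialize(obj) -> bytes:
    """Stand-in serializer: compact JSON encoded as UTF-8 bytes."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def deserialize(data: bytes):
    """Inverse of serialize()."""
    return json.loads(data.decode("utf-8"))


def pack(obj, compress: bool = False) -> bytes:
    """Serialize first, then optionally compress.

    A leading flag byte (hypothetical framing) records which path was taken,
    so the serializer itself never needs to know about compression.
    """
    payload = serialize(obj)
    if compress:
        return b"\x01" + zlib.compress(payload)
    return b"\x00" + payload


def unpack(data: bytes):
    """Undo pack(): strip the flag, decompress if needed, then deserialize."""
    flag, body = data[:1], data[1:]
    if flag == b"\x01":
        body = zlib.decompress(body)
    elif flag != b"\x00":
        raise ValueError("unknown framing flag")
    return deserialize(body)


small = {"id": 1, "name": "a"}
large = {"rows": [{"id": i, "name": "item"} for i in range(1000)]}

# Compare sizes with and without the compression layer.
for label, obj in (("small", small), ("large", large)):
    raw = pack(obj)
    packed = pack(obj, compress=True)
    assert unpack(raw) == obj and unpack(packed) == obj
    print(label, len(raw), len(packed))
```

The repetitive `large` payload shrinks noticeably, while for very small payloads the fixed zlib header and checksum can outweigh any savings, and every compressed call pays extra CPU time. That is the trade-off behind keeping compression optional and separate.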

Clockwork-Muse commented 1 month ago

BinaryFormatter wasn't necessarily a more-performant serializer, and there's nothing inherently more performant about how it worked compared to third-party packages. The fundamental unsafety was unrelated to performance (and entirely due to how it worked generally).
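
The unsafety comes from letting the payload decide which types get instantiated. A common alternative for polymorphic data is a closed set of types selected by a short discriminator tag. The sketch below shows that idea in Python as a language-neutral illustration; the `Circle`/`Rect` types, the `"$type"` field name, and the `KNOWN_TYPES` table are all hypothetical names invented for this example.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Circle:
    radius: float


@dataclass
class Rect:
    width: float
    height: float


# Closed set of types the reader is willing to construct, keyed by a short tag.
# The payload can only select from this table; it cannot name arbitrary types.
KNOWN_TYPES = {"circle": Circle, "rect": Rect}
TAG_OF = {cls: tag for tag, cls in KNOWN_TYPES.items()}


def dumps(shape) -> bytes:
    """Write the value together with its discriminator tag."""
    tag = TAG_OF.get(type(shape))
    if tag is None:
        raise TypeError(f"{type(shape).__name__} is not registered")
    return json.dumps({"$type": tag, "data": asdict(shape)}).encode("utf-8")


def loads(data: bytes):
    """Read a value, rejecting any tag that is not in the allow-list."""
    doc = json.loads(data.decode("utf-8"))
    tag = doc.get("$type") if isinstance(doc, dict) else None
    if not isinstance(tag, str) or tag not in KNOWN_TYPES:
        raise ValueError(f"refusing unknown type tag: {tag!r}")
    return KNOWN_TYPES[tag](**doc["data"])


print(loads(dumps(Rect(2.0, 3.0))))
```

A payload such as `{"$type": "os.system", "data": {}}` is rejected with a `ValueError` instead of being resolved to a type. System.Text.Json's `JsonDerivedType` attribute follows a similar declare-the-derived-types-up-front approach.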

neon-sunset commented 1 month ago

Not everything has to live in CoreLib or extension packages of .NET itself. Binary serialization is a perfect example of a specialized solution that is best served by standalone libraries. Besides MessagePack, there are multiple performant binary serde packages for .NET, like MemoryPack.

VenkateshSrini commented 1 month ago

There are many serializers, such as MessagePack and Protobuf, but each has its own overhead and limitations when performing serialization.

They are just "good" serializers. The limitations are something you must pay for.

support for polymorphic serialization

This is where the danger comes from.

compression OOB

This is a separate layer that can be added independently. It will definitely harm performance in real-world cases. If your data is large enough to benefit from compression, you should probably use a database-like solution.

The overhead of MessagePack- and Protobuf-like serializers is close to the minimum needed for a "correct" serializer. Dumping memory blocks is fundamentally incorrect for persisted storage.

So can we build and provide developers a serializer that has less overhead, offers features like polymorphic serialization, handles at least 95% of scenarios, and also provides a way to extend the functionality when custom serialization is required?

VenkateshSrini commented 1 month ago

Not everything has to live in CoreLib or extension packages of .NET itself. Binary serialization is a perfect example of a specialized solution that is best served by standalone libraries. Besides MessagePack, there are multiple performant binary serde packages for .NET, like MemoryPack.

Is there some kind of library that can act as an adapter for the best serialization libraries, such as MemoryPack, MessagePack, protobuf, and other serializers? Something like EF, where we can define what we want to do in a library-agnostic way and then choose a plug-in library for a popular serializer such as MessagePack, MemoryPack, or Protobuf. We would prefer not to add any attributes or implement any interfaces on existing classes; it should consume each class as it is. Is this at least a good thought? The idea is to switch between libraries seamlessly.
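
A rough sketch of the adapter idea, written in Python purely as an illustration (none of these names come from an existing library): a `Backend` protocol that each format plugs into, and a `Serializer` front end that turns a plain class into a field dictionary without the class implementing any serializer-specific interface. JSON and the standard library's binary plist format stand in for real serializers such as MessagePack or MemoryPack.

```python
import json
import plistlib
from dataclasses import dataclass, asdict
from typing import Protocol


class Backend(Protocol):
    """Hypothetical contract every format plug-in would implement."""

    def encode(self, fields: dict) -> bytes: ...
    def decode(self, data: bytes) -> dict: ...


class JsonBackend:
    def encode(self, fields: dict) -> bytes:
        return json.dumps(fields).encode("utf-8")

    def decode(self, data: bytes) -> dict:
        return json.loads(data.decode("utf-8"))


class BinaryPlistBackend:
    def encode(self, fields: dict) -> bytes:
        return plistlib.dumps(fields, fmt=plistlib.FMT_BINARY)

    def decode(self, data: bytes) -> dict:
        return plistlib.loads(data)


class Serializer:
    """Format-agnostic front end; the backend is chosen at construction time."""

    def __init__(self, backend: Backend):
        self._backend = backend

    def serialize(self, obj) -> bytes:
        # asdict() reads the dataclass fields; the class needs no extra markup.
        return self._backend.encode(asdict(obj))

    def deserialize(self, cls, data: bytes):
        # Only handles flat classes; nested types would need extra mapping.
        return cls(**self._backend.decode(data))


@dataclass
class Order:
    order_id: int
    customer: str
    total: float


order = Order(order_id=7, customer="Ada", total=19.5)
for backend in (JsonBackend(), BinaryPlistBackend()):
    s = Serializer(backend)
    assert s.deserialize(Order, s.serialize(order)) == order
```

Even in this toy version the formats do not line up fully: the plist backend cannot encode `None`, and the JSON backend cannot encode raw `bytes`, so a truly seamless switch would need the contract to define how such gaps are handled.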

huoyaoyuan commented 1 month ago

Is there some kind of library that can act as an adapter for the best serialization libraries, such as MemoryPack, MessagePack, protobuf, and other serializers? Something like EF, where we can define what we want to do in a library-agnostic way

There was a very old DataContractSerializer that tried to adapt both XML and JSON. It's considered a failure because the differences between XML and JSON are too significant. EF is mostly for relational databases, which have many similarities. I'm not an expert on binary formats, but I'm generally pessimistic about this.

VenkateshSrini commented 1 month ago

But then if we have a standard way of defining things and then have the underlying l

Is there some kind of library that can act as an adapter for the best serialization libraries, such as MemoryPack, MessagePack, protobuf, and other serializers? Something like EF, where we can define what we want to do in a library-agnostic way

There was a very old DataContractSerializer that tried to adapt both XML and JSON. It's considered a failure because the differences between XML and JSON are too significant. EF is mostly for relational databases, which have many similarities. I'm not an expert on binary formats, but I'm generally pessimistic about this.

It would help to define what needs to be done in a common way and then have the underlying library handle it in one way or another. It could be in line with what we are doing with IMessageFormatters. I'm not saying this is easy. Also, the communities that support these libraries would need to come together to agree on such contracts.

dotnet-policy-service[bot] commented 1 month ago

Tagging subscribers to this area: @dotnet/area-system-runtime. See info in area-owners.md if you want to be subscribed.