pairbit closed 1 year ago
please provide more details
added benchmark
Thanks, it's too big a difference, so it's natural to be concerned.
Decompiled result of Hyper
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static Span<byte> Serialize(Person obj)
{
    var offset = 0;
    var offsetWritten = 0;
    var len = 41 + (obj.Name?.Length ?? 0) * Unsafe.SizeOf<System.Char>() + (obj.Mother?.Length ?? 0) * Unsafe.SizeOf<System.Char>() + (obj.Father?.Length ?? 0) * Unsafe.SizeOf<System.Char>();
    Span<byte> bytes = new byte[len];
    var _Id = (System.Guid)obj.Id; MemoryMarshal.Write(bytes.Slice(offset += offsetWritten, offsetWritten = 16), ref _Id);
    int _Name = (obj.Name?.Length ?? -1) * Unsafe.SizeOf<System.Char>(); MemoryMarshal.Write(bytes.Slice(offset += offsetWritten, offsetWritten = 4), ref _Name);
    if (_Name > 0) { var b = bytes.Slice(offset += offsetWritten, offsetWritten = _Name); MemoryMarshal.Cast<char, byte>(obj.Name.AsSpan()).CopyTo(b); }
    var _Age = (System.Int32)obj.Age; MemoryMarshal.Write(bytes.Slice(offset += offsetWritten, offsetWritten = 4), ref _Age);
    var _IsDeleted = (System.Boolean)obj.IsDeleted; MemoryMarshal.Write(bytes.Slice(offset += offsetWritten, offsetWritten = 1), ref _IsDeleted);
    var _Created = (System.DateTime)obj.Created; MemoryMarshal.Write(bytes.Slice(offset += offsetWritten, offsetWritten = 8), ref _Created);
    int _Mother = (obj.Mother?.Length ?? -1) * Unsafe.SizeOf<System.Char>(); MemoryMarshal.Write(bytes.Slice(offset += offsetWritten, offsetWritten = 4), ref _Mother);
    if (_Mother > 0) { var b = bytes.Slice(offset += offsetWritten, offsetWritten = _Mother); MemoryMarshal.Cast<char, byte>(obj.Mother.AsSpan()).CopyTo(b); }
    int _Father = (obj.Father?.Length ?? -1) * Unsafe.SizeOf<System.Char>(); MemoryMarshal.Write(bytes.Slice(offset += offsetWritten, offsetWritten = 4), ref _Father);
    if (_Father > 0) { var b = bytes.Slice(offset += offsetWritten, offsetWritten = _Father); MemoryMarshal.Cast<char, byte>(obj.Father.AsSpan()).CopyTo(b); }
    return bytes;
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Person Deserialize(ReadOnlySpan<byte> bytes)
{
    Person obj = new();
    var offset = 0;
    var offsetWritten = 0;
    int len0 = 0;
    obj.Id = (System.Guid)MemoryMarshal.Read<System.Guid>(bytes.Slice(offset += offsetWritten, offsetWritten = 16));
    var _Name = (Int32)MemoryMarshal.Read<Int32>(bytes.Slice(offset += offsetWritten, offsetWritten = 4));
    obj.Name = (_Name >= 0) ? MemoryMarshal.Cast<byte, char>(bytes.Slice(offset += offsetWritten, offsetWritten = _Name)).ToString() : null;
    obj.Age = (System.Int32)MemoryMarshal.Read<System.Int32>(bytes.Slice(offset += offsetWritten, offsetWritten = 4));
    obj.IsDeleted = (System.Boolean)MemoryMarshal.Read<System.Boolean>(bytes.Slice(offset += offsetWritten, offsetWritten = 1));
    obj.Created = (System.DateTime)MemoryMarshal.Read<System.DateTime>(bytes.Slice(offset += offsetWritten, offsetWritten = 8));
    var _Mother = (Int32)MemoryMarshal.Read<Int32>(bytes.Slice(offset += offsetWritten, offsetWritten = 4));
    obj.Mother = (_Mother >= 0) ? MemoryMarshal.Cast<byte, char>(bytes.Slice(offset += offsetWritten, offsetWritten = _Mother)).ToString() : null;
    var _Father = (Int32)MemoryMarshal.Read<Int32>(bytes.Slice(offset += offsetWritten, offsetWritten = 4));
    obj.Father = (_Father >= 0) ? MemoryMarshal.Cast<byte, char>(bytes.Slice(offset += offsetWritten, offsetWritten = _Father)).ToString() : null;
    return obj;
}
Decompiled result of MemoryPack (edited, slimmed)
static void IMemoryPackable<Person>.Serialize<TBufferWriter>(ref MemoryPackWriter<TBufferWriter> writer, scoped ref Person? value)
{
    if (value == null)
    {
        writer.WriteNullObjectHeader();
        goto END;
    }
    writer.WriteUnmanagedWithObjectHeader(7, value.@Id);
    writer.WriteString(value.@Name);
    writer.WriteUnmanaged(value.@Age, value.@IsDeleted, value.@Created);
    writer.WriteString(value.@Mother);
    writer.WriteString(value.@Father);
}
static void IMemoryPackable<Person>.Deserialize(ref MemoryPackReader reader, scoped ref Person? value)
{
    if (!reader.TryReadObjectHeader(out var count))
    {
        value = default!;
        goto END;
    }
    global::System.Guid __Id;
    string __Name;
    int __Age;
    bool __IsDeleted;
    global::System.DateTime __Created;
    string __Mother;
    string __Father;
    if (count == 7)
    {
        if (value == null)
        {
            reader.ReadUnmanaged(out __Id);
            __Name = reader.ReadString();
            reader.ReadUnmanaged(out __Age, out __IsDeleted, out __Created);
            __Mother = reader.ReadString();
            __Father = reader.ReadString();
            goto NEW;
        }
    }
    // trimmed other code (versioning, SET:, etc...)
    NEW:
    value = new Person()
    {
        @Id = __Id,
        @Name = __Name,
        @Age = __Age,
        @IsDeleted = __IsDeleted,
        @Created = __Created,
        @Mother = __Mother,
        @Father = __Father
    };
    END:
    return;
}
Both formats are very similar in that they are sequential and write/read the memory data itself as much as possible. However, there are several factors that make a difference in performance.
While there are significant advantages to handling UTF16 as is, MemoryPack chose UTF8 because UTF16 doubles the payload size for ASCII text. A 50% reduction is a level of compression that would be very difficult to achieve with a general-purpose compression library.
Also, if the benchmark payload is large enough to land in the LOH (Large Object Heap), there will be a significant performance loss. Therefore, a 50% size reduction matters for performance as well.
UTF16 can be selected instead with MemoryPackOptions.Utf16.
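To make the size tradeoff concrete, here is a minimal sketch that only compares encoded byte counts. The class and method names (EncodingSizeDemo, Measure, LandsOnLoh) are hypothetical and not part of MemoryPack or Hyper; the 85,000-byte figure is the default .NET LOH threshold, and the check ignores the array header.

```csharp
using System;
using System.Text;

public static class EncodingSizeDemo
{
    // Default .NET threshold: objects of 85,000 bytes or more go to the Large Object Heap.
    public const int LohThresholdBytes = 85_000;

    // Returns the encoded size of the same text in UTF8 and in UTF16 (little-endian).
    public static (int Utf8Bytes, int Utf16Bytes) Measure(string text)
    {
        int utf8 = Encoding.UTF8.GetByteCount(text);
        int utf16 = Encoding.Unicode.GetByteCount(text);
        return (utf8, utf16);
    }

    // Rough check (array header ignored): would a payload of this size land on the LOH?
    public static bool LandsOnLoh(int payloadBytes) => payloadBytes >= LohThresholdBytes;
}
```

For a 50,000-character ASCII string, Measure reports 50,000 bytes in UTF8 and 100,000 bytes in UTF16, so only the UTF16 payload crosses the LOH threshold. For non-ASCII text the ratio is different, and UTF8 can even be larger.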
In MemoryPack, when a byte[] is requested, the data is first written to MemoryPackWriter and then copied out with ToArray at the end. This is because the final buffer size is not known during serialization. Computing the size up front, for example for an array of objects, would mean traversing the data twice.
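The two strategies can be sketched side by side. This is an illustration under assumptions, not MemoryPack's implementation: BufferStrategies and its methods are hypothetical names, the format is a simple 4-byte length prefix followed by UTF8 bytes, and null strings are not handled.

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;
using System.Text;

public static class BufferStrategies
{
    // Size unknown up front: write into a growable buffer, then copy once at the end.
    public static byte[] WriteThenToArray(string[] values)
    {
        var writer = new ArrayBufferWriter<byte>();
        foreach (string s in values)
        {
            // Reserve room for the length prefix plus the worst-case UTF8 size.
            Span<byte> span = writer.GetSpan(sizeof(int) + Encoding.UTF8.GetMaxByteCount(s.Length));
            int written = Encoding.UTF8.GetBytes(s.AsSpan(), span.Slice(sizeof(int)));
            BinaryPrimitives.WriteInt32LittleEndian(span, written);
            writer.Advance(sizeof(int) + written);
        }
        return writer.WrittenSpan.ToArray(); // the extra copy
    }

    // Exact size first: no final copy, but the data is traversed twice.
    public static byte[] WriteTwoPass(string[] values)
    {
        int total = 0;
        foreach (string s in values) // pass 1: measure
            total += sizeof(int) + Encoding.UTF8.GetByteCount(s);

        var bytes = new byte[total];
        int offset = 0;
        foreach (string s in values) // pass 2: write
        {
            int written = Encoding.UTF8.GetBytes(s.AsSpan(), bytes.AsSpan(offset + sizeof(int)));
            BinaryPrimitives.WriteInt32LittleEndian(bytes.AsSpan(offset), written);
            offset += sizeof(int) + written;
        }
        return bytes;
    }
}
```

Both methods produce identical bytes; they differ only in where the cost goes (one extra copy versus one extra traversal).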
Hyper has the strong restriction that types requiring size computation are unsupported, and it always assumes that serialized types are fixed-length. Therefore, it can write directly into a byte[] of the final size.
Even MemoryPack, on .NET 7, writes to a fixed-size byte[] if the types are all fixed-length. However, String is treated as variable-length (UTF8!), so the example type does not qualify for that optimization.
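As a rough sketch of what "fixed-length" means here: a type with no references has a size known from the type alone, so the output array can be allocated exactly once. FixedLengthDemo and its methods are hypothetical names, and this is not claimed to be the check MemoryPack performs.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class FixedLengthDemo
{
    // A type is treated as fixed-length here if it contains no references at all.
    public static bool IsFixedLength<T>() =>
        !RuntimeHelpers.IsReferenceOrContainsReferences<T>();

    // For such types the final size is known up front, so the array is allocated once.
    public static byte[] WriteFixed<T>(T value) where T : unmanaged
    {
        var bytes = new byte[Unsafe.SizeOf<T>()];
        Unsafe.WriteUnaligned(ref bytes[0], value);
        return bytes;
    }
}
```

IsFixedLength<Guid>() and IsFixedLength<int>() are true, while IsFixedLength<string>() is false, which is why a type with String members falls back to the variable-length path.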
MemoryPack routes all input and output through MemoryPackWriter/MemoryPackReader for flexibility (IBufferWriter and ReadOnlySequence support). This has significant performance advantages of its own; for example, it can connect directly to Kestrel's PipeBody.
Hyper is completely inlined because it is dedicated to reading and writing byte[] only. Coupled with the fixed-length format specification, this has the advantage of avoiding length checks.
MemoryPackWriter/Reader also has optimizations that try to avoid length checks whenever possible, such as writer.WriteUnmanaged(value.@Age, value.@IsDeleted, value.@Created); but when a String or similar member sits in between, the optimization breaks down.
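The idea behind batching several unmanaged values into one call can be sketched as follows: a single length check covers all three values, and the individual writes are unchecked. BatchedWriteDemo.WriteBatched is a hypothetical helper written for illustration; it is not the actual WriteUnmanaged implementation.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class BatchedWriteDemo
{
    // Writes age, isDeleted and created back to back and returns the bytes written.
    public static int WriteBatched(Span<byte> destination, int age, bool isDeleted, DateTime created)
    {
        int size = sizeof(int) + sizeof(bool) + Unsafe.SizeOf<DateTime>();

        // The only length check for the whole batch.
        if (destination.Length < size)
            throw new ArgumentException("Destination is too small.", nameof(destination));

        // Unchecked writes at fixed offsets from the start of the span.
        ref byte start = ref MemoryMarshal.GetReference(destination);
        Unsafe.WriteUnaligned(ref start, age);
        Unsafe.WriteUnaligned(ref Unsafe.Add(ref start, sizeof(int)), isDeleted);
        Unsafe.WriteUnaligned(ref Unsafe.Add(ref start, sizeof(int) + sizeof(bool)), created);
        return size;
    }
}
```

A variable-length member such as a String between two of these values would split the batch, because its size is only known at run time and needs its own check.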
I do not consider Hyper to be a competitor, because its specification is not suitable for a general-purpose serializer. However, I do not like seeing a 3x performance difference (even if I change to UTF16), so I would like to optimize MemoryPack a bit more.
thanks for such a detailed answer