jefffhaynes / BinarySerializer

A declarative serialization framework for controlling formatting of data at the byte and bit level using field bindings, converters, and code.
MIT License
290 stars 62 forks source link

Parallel serialization not working correctly #194

Open pjavelasco opened 1 year ago

pjavelasco commented 1 year ago

Hello, I have the following method using BinarySerializer:

    protected static async Task<byte[]> ConvertBinaryObjectToByteArrayAsync(object obj)
    {
        var binarySerializer = new BinarySerializer
        {
            Endianness = Endianness.Little
        };
        using var ms = new MemoryStream();
        await binarySerializer.SerializeAsync(ms, obj);
        return ms.ToArray();
    }

When I use this method in async way (I created several tasks and wait for all), tasks do not run in parallel way, the total time for all tasks is the sum of the time of each task.

I think "SerializeAsync" has a problem inside like a static constructor invocation or something like that.

Thanks in advance

jefffhaynes commented 1 year ago

Which tasks are you expecting to run in parallel? Tasks do not imply thread parallelism, they can be used to create parallelism in certain cases.

pjavelasco commented 1 year ago

For example:

var task1 = ConvertBinaryObjectToByteArrayAsync(list1);
var task2 = ConvertBinaryObjectToByteArrayAsync(list2);
var task3 = ConvertBinaryObjectToByteArrayAsync(list3);

await Task.WhenAll(task1, task2, task3);

My intention is to execute the 3 tasks in parallel, this is, in separated threads.

If I replace await binarySerializer.SerializeAsync(ms, obj); by Task.Delay(randomtime); works correctly.

jefffhaynes commented 1 year ago

Can you give me an example of an object definition that causes this to happen? Thanks

EDIT: I've tried a few different versions of this now and I can't get it to fail. But you're right, there are static constructors that could be causing issues, I'm just not quite able to see where...

bevanweiss commented 1 year ago

@pjavelasco You might need to supply an actual test case here. Something that compiles and runs and that shows your problem.

I'm suspecting that you're doing something like passing in the same stream object to each invocation.

I also think you might be confusing multi-threading and asynchronous operation. https://learn.microsoft.com/en-us/dotnet/csharp/asynchronous-programming/task-asynchronous-programming-model Async does NOT mean you spin up a separate thread to perform the workload. It means that the task has aspects of it that likely involve awaiting a result from somewhere else (like network / disk / user input), and that it might be beneficial to perform other CPU tasks whilst it is performing this awaiting. Async typically happens on the same thread (unless you specifically do something else to spin up a thread for it).

Your example of replacing SerializeAsync(...) by Task.Delay is an example. Task.Delay doesn't spin off a separate thread to do work. It just waits, for a certain period of time. During this period of time THE SAME THREAD can be doing additional work progressing other bits of code asynchronously. If each thing that you await is mostly CPU bound, then you will not see any benefits to async, it's only when its something non-CPU bound (i.e. network / disk / user) that you can see benefits (more useful work per time period).

I did see one part in the current async code that potentially has some synch locking problem, if using a LengthPrefixedString, it's written using the synchronous Write method. So it would 'stall' whilst writing this value to the stream. But this is not the droid you're looking for...

        public async Task SerializeAsync(...)
        ...
                case SerializedType.LengthPrefixedString:
                {
                    if (constLength != null)
                    {
                        throw new NotSupportedException(LengthPrefixedStringsCannotHaveConstLengthMessage);
                    }

                    writer.Write(value.ToString());
                    break;
                }
       ...

https://github.com/jefffhaynes/BinarySerializer/blob/732c2c412253490645c48350ebb4b4c7f4e97a49/BinarySerializer/Graph/ValueGraph/ValueValueNode.cs#L404

jefffhaynes commented 1 year ago

There actually is no overload for async string writing here. However, I don't think this would cause a significant issue unless this was somehow a giant string, slow stream, etc.