txdv / LibuvSharp

.NET bindings for libuv
176 stars 41 forks source link

'TcpAsync' example fails #4

Closed neoxic closed 10 years ago

neoxic commented 10 years ago

Don't know if I'm doing it "right", but I just cloned everything, compiled and built clean and shiny on Arch Linux.

Running TcpAsync.exe immediately leads to this:

$ ./TcpAsync.exe
Starting example.
From Server: Hello World!

Unhandled Exception:
System.InvalidOperationException: The underlying Task is already in one of the three final states: RanToCompletion, Faulted, or Canceled.
at System.Threading.Tasks.TaskCompletionSource`1<System.Nullable`1<System.ArraySegment`1<byte>>>.SetResult (System.Nullable`1<System.ArraySegment`1<byte>>) <0x00057>
at LibuvSharp.Threading.Tasks.IUVStreamExtensions/<ReadAsync>c__AnonStorey15.<>m__1E () <0x00047>
at LibuvSharp.Handle/<Close>c__AnonStorey0.<>m__0 () <0x0006c>
at LibuvSharp.CAction.PrivateCallback () <0x00022>
at (wrapper native-to-managed) LibuvSharp.CAction.PrivateCallback () <0x0006c>
at (wrapper managed-to-native) LibuvSharp.Loop.uv_run (intptr,LibuvSharp.uv_run_mode) <0x00060>
at LibuvSharp.Loop.Run () <0x00017>
at LibuvSharp.Threading.Tasks.LoopExtensions.Run (LibuvSharp.Loop,System.Func`1<System.Threading.Tasks.Task>) <0x001e7>
at Test.MainClass.Main (string[]) <0x000a3>

From Client: Labas Pasauli!
mono: src/unix/stream.c:1074: uv_shutdown: Assertion `((stream)->io_watcher.fd) >= 0' failed.
Stacktrace:

  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) LibuvSharp.UVStream.uv_shutdown (intptr,intptr,LibuvSharp.Handle/callback) <0xffffffff>
  at LibuvSharp.UVStream.Shutdown (System.Action) <0x0026f>
  at LibuvSharp.IUVStreamExtensions.Shutdown (LibuvSharp.IUVStream) <0x00018>
  at Test.MainClass/<Server>c__async0.MoveNext () <0x00433>
  at (wrapper unbox) Test.MainClass/<Server>c__async0.MoveNext () <0xffffffff>
  at System.Threading.Tasks.ActionContinuation.Execute () <0x00019>
  at System.Threading.Tasks.Task.ProcessCompleteDelegates () <0x00053>
  at System.Threading.Tasks.Task.Finish () <0x00157>
  at System.Threading.Tasks.Task`1.TrySetResult (TResult) <0x000eb>
  at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.SetResult (TResult) <0x00047>
  at AsyncExtensions/<ReadStringAsync>c__async2.MoveNext () <0x001b7>
  at (wrapper unbox) AsyncExtensions/<ReadStringAsync>c__async2.MoveNext () <0xffffffff>
  at System.Threading.Tasks.ActionContinuation.Execute () <0x00019>
  at System.Threading.Tasks.Task.ProcessCompleteDelegates () <0x00053>
  at System.Threading.Tasks.Task.Finish () <0x00157>
  at System.Threading.Tasks.Task`1.TrySetResult (TResult) <0x000eb>
  at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.SetResult (TResult) <0x00047>
  at AsyncExtensions/<ReadStringAsync>c__async3.MoveNext () <0x002ff>
  at (wrapper unbox) AsyncExtensions/<ReadStringAsync>c__async3.MoveNext () <0xffffffff>
  at System.Threading.Tasks.ActionContinuation.Execute () <0x00019>
  at System.Threading.Tasks.Task.ProcessCompleteDelegates () <0x00053>
  at System.Threading.Tasks.Task.Finish () <0x00157>
  at System.Threading.Tasks.Task.ThreadStart () <0x0061f>
  at System.Threading.Tasks.Task.Execute () <0x0000f>
  at System.Threading.Tasks.TpScheduler.TaskExecuterCallback (object) <0x00047>
  at (wrapper runtime-invoke) <Module>.runtime_invoke_void__this___object (object,intptr,intptr,intptr) <0xffffffff>
txdv commented 10 years ago

Yeah, I found the bug.

https://github.com/txdv/LibuvSharp/blob/master/LibuvSharp/Threading/Tasks/IUVStreamExtensions.cs#L17-L18

Complete and Data get called together before the delegates get unhooked. I will submit the patch once I'm back home but async/await seems not to work properly anyway. I tried to write a small webserver and I have multiple strange artifacts that I can not comprehend or explain. Like data of the a first request being sent only after I do a second one and so on...

I will to test it on windows and look whether it is a mono bug or my async implementation is wrong.

neoxic commented 10 years ago

Unfortunately, it looks like your approach has a lot of weaknesses. I've spent last several months in an internal project integrating 'libuv' and C# from scratch and have encountered an enormous number of subtleties.

To begin with, callback delegates should be assured to be alive until the end of any possible callback from 'libuv'. So, it's a separate problem to get them available and properly garbage collected in all scenarios. It's not sufficient just to pass a delegate to a P/Invoked call because if it doesn't have a strong link somewhere, it can be (and will be) garbage collected right after (or even before!!!) the actual call to C-code has been made.

Another thing is to synchronize handle and loop disposals. You need to be aware of the GC and the user all the time to get things to work reliably.

And the main thing is that I ended up writing a separate layer in pure C to actually integrate into C# to hide some internals like memory allocation, structure formats and callback signatures. The approach of duplicating libuv's structures in C# is extremely ephemeral between versions. I can't imagine how you will manage to preserve offsets of fields each time a new minor version is out (I'm not even talking about major switches when the API changes - I've met that already being on v.11).

As to the async/await pattern, I found out that it has some noticeable overhead if used for every potential read/write operation. Having first implemented these operations with it, I made comparisons with the same tests in pure C and had a difference around 4-5 times. Gave up using async/await for actual data reading/writing.

Hopefully, I will be able to open the project in some future time. So everyone can benefit.

txdv commented 10 years ago

The callbacks are an easy fix.

I guess this applies to the write functions where each write operation can have its own callback most. We just have to remember to store them until they are called, right?

The different positioning in structs ... I think I'll try to cook up something with clang to generate them all. For now I checkout the version I want to use and then use diff git diff to check for differences in the struct.

As for performance, yeah, I already thought that might happen. Have you tried using the callback based interfaces? Measured the performance? I think async await should be anyway used in high level abstractions like webrequest and all the other goodies.

As for your own C layer, if you keep your project for yourself, nobody will be able to join in the effort and help you. The earlier you share the better. Especially since I would join the train immediately.

neoxic commented 10 years ago

Indeed, they are an easy fix unless it's appropriate to create a delegate on every call to write. I'm not ok with that because it's wasteful to create a delegate each time you write a couple of bytes. The similar problem actually slows down the async/await pattern - a task and its completion source is created along with a lot of internal calls behind the scenes. Eventually, the code spends more time in GC than in I/O.

Yes, I have already tried dry static callbacks. And the performance is very close to the C versions of the tests. The drawback is about 10% which is quite suitable for me.

Unfortunately, I am not free to choose whether to open the project because I am doing it under a contract for a company. But I promise to do my best to convince the company to open it in the future.

txdv commented 10 years ago

Well, one doesn't have to use the write with the callback, it is just a nice additional thing.

Another additional reason why it is slower with async/await is that because in my implementation I am constantly calling Resume and Pause. This could be optimized away, because libuv now exposes a try_write which tries to write immediately, but doesn't work on windows.

We have the pattern of

while (true) {
    await ReadAsync(); // after the call pause is called, before the call resume is called
}

We could instead check at the beginning of every loop iterations if we are still reading and Pause/Resume only when the reading state has changed.

Furthermore we could implement our own TaskCompletionSource, because TCS is thread safe and what not, is a class (we could use a struct, so the GC wouldn't care much about it).

Then there are the IAwaitable/IAwaiter interfaces, I don't know yet if we could use them to optimize the code.

There is a lot of room to improve without jumping right to a c wrapper. I didn't start with all optimizations right away because I thought it would be nicer to have the functionality first.

txdv commented 10 years ago

O, another thing that I remember is that the default ByteBufferAllocator is pretty wasteful, that could be optimized as well.

txdv commented 10 years ago

https://github.com/txdv/LibuvSharp/commit/ec205bcca4880712a16cc1b010cb63e56a00c00e fixes the original issue

neoxic commented 10 years ago

Well, I actually did implement my own awaited type (much like TCS+Task), but it should still be a class, because there's no common stack between two callbacks in any time, and a delegate should still be created and bound in a call to OnCompleted(). So there's no room to optimize here. But I'm fine with that because I use it only for calls like connect or shutdown.

I also don't use any "unsafe" code because it looks like it's pretty easy to shoot your own foot even with plain P/Invoke with all its hassle to leave unsafe blocks alone.

As to higher abstractions for async/await, I agree with you. In general, the thing is ok for its purpose when not abused.

txdv commented 10 years ago

I guess you won't release the source code of your wrapper.

neoxic commented 10 years ago

As I told earlier, I will do everything in my power to convince the company to open it in the future.

txdv commented 10 years ago

inform us on the companies decision even if they don't let you open source it