Closed rogeralsing closed 9 years ago
actually, it will leak context even if two or more tests are running and are async,
e.g. test 1 sets context in thread 1, test 2 sets context in thread 2 test 1 calls await , test 2 calls await test 1 resumes on thread 2, test 2 resumes on thread 1
Now implicit Sender
is swapped for both tests..
As @nvivo noted on gitter, instead of setting state in a thread static, it could be set in the CallContext. This does in theory solve the issue at hand (at least should do it unless there are more lazer unicorn dragons buried here)
But.. xUnit 1.9.2 blows up if we use CallContext for this.. Hopefully 2.0 is OK with this change.
This would however affect throghput negatively, but I guess correct and slower is better than fast and broken..
Even then its definitely not a silver bullet. This guy explains it perfectly: http://blog.stephencleary.com/2013/04/implicit-async-context-asynclocal.html
Luckily, we ony store the implicit context data like Sender and current message in the call context. So we are not mutating / evolving any datastructure like in the blog above.
cc @StephenCleary we need you! <3 The async await stuff with implicit contexts are killing us
My only worry is that these PRs are only treating symptoms, not root causes. The fact is that these problems should not be happening in the first place. This has nothing to do with akka and async being incompatible at some fundamental level, it's just about implementation.
If a synccontext wrapper is a solution, this can be applied to any environment, not only xunit. That means ditching CallContext, changing dispatchers a little bit and fixing these problems forever. But the same solution can be achieved by keep using just CallContext and adjusting the ActorTaskScheduler to flow these values in the right order.
More issues will appear because of async. We should understand what is causing them first, and mainly have an agreement by the leaders on what the issue is and the direction we should follow to fix them once and for all.
I'm afraid keep accepting patches to very specific problems will only turn akka in a patchwork quilt.
We should understand what is causing them first
ThreadStatic
variables are evil- Current async
Receive
implementation is a one big hack, which can be improved, however. First of all we do need to have a custom SynchronizationContext. It just has to wrap the current one and be switched with push/pop semantics (see xunit test invoker for reference). We also have to suspend system message queue while task returned by the Actor is still running. If we won't do that, actor can be killed while still processing a message being awaiting for a Task whenKill
message arrives. I also think thatAsyncBehavior.Reentrant
should be removed, people should usePipeTo
for that (we can add some syntactic sugar later however).
The only problem with a custom SynchronizationContext I see it that somewhere could be async void
method that captures it. But if it only restores the ambient state and nothing else, it should be completely fine.
I'm afraid keep accepting patches to very specific problems will only turn akka in a patchwork quilt.
That particular case has nothing to do with async actors, dispatchers, etc. It's because of the free context in which xunit runs tests.
That particular case has nothing to do with async actors, dispatchers, etc. It's because of the free context in which xunit runs tests.
This is exactly the short sight I'm talking about. It's all the same problem, this is just treating symptoms.
I'm fine with working around a specific issue to keep the project going. But it's different to say this is fixing a real issue. It's not. Just like #905 didn't fix anything, it's all band-aids.
I totally agree with you @nvivo, but there is more to it.
We have to stick with the API we've got for V1+ , OnReceive returns a bool. so we cannot return tasks. and changing that API at this point is a no go.
We might have the luxury to change this for v2 or simply make it work differently in Akka.Typed.
But right now, we are stuck with the implicit context as that is how JVM Akka is designed and the Receive API looks the way it does.
So we need to focus on whats the correct thing to do in the current position. Changing the overall API is not an option right now.
So what is our best alternative right now?
OnReceive returns a bool. so we cannot return tasks
It can return Task
Our message pipeline is extremely lightweight right now. Making OnReceive return a task, even if it is just a Task.FromResult, gimps our maximum throughput by millions of messages per second.
On my machine it drops from 34 mil messages down to 20 mil messages per sec in the benchmark app, from increased GC pressure alone.
This is a prices that all actors have to pay if we make that change, even if they are not async. Thats a very high price to pay for a feature that is just "nice to have"
A completed task cached in a static variable costs absolutely nothing. No GC pressure, etc
I'm not follwoing, when is a completed task a cached static variable? Just declaring a completed task inside the receive method, has the above described effect on throughput.
e.g.
var dummy = Task.FromResult(true);
This has been tested and is measurable.
Or are you saying we should do something like:
return TaskTrue;
where that task is declared somewhere else?
Another thing that I also don't get is this:
Lets say that OnReceive returns Task
or Task<bool>
and we are OK with any perf losses from it.
Then what?
When should the TaskScheduler or SynchronizationContext be applied? (This is still needed as the implicit actor context is the real issue, not async await itself. so the context needs to async flow.)
For every single message? As that needs to be set before OnReceive is invoked. which is super expensive.
I feel like an Ogre with a peanut brain when discussing this topic as everyone seem to have a very clear idea on how this could be solved while I myself don't :)
Or are you saying we should do something like:
return TaskTrue;
where that task is declared somewhere else?
https://github.com/StephenCleary/AsyncEx/wiki/TaskConstants
When should the TaskScheduler or SynchronizationContext be applied?
Before calling Recieve
in ActorBase
which is super expensive.
Sync context may be a part of ActorCell and also be reused. SynchronizationContext.SetCurrent
is cheap, it just obtains execution context and sets some properties.
private class DummySynchronizationContext : SynchronizationContext
{
}
private static DummySynchronizationContext _context1 = new DummySynchronizationContext(),
_context2 = new DummySynchronizationContext();
public static void Main()
{
SynchronizationContext.SetSynchronizationContext(_context2);
var st = Stopwatch.StartNew();
for (var c = 0; c < 100000000; c++)
{
SynchronizationContext.SetSynchronizationContext(c%2 == 0 ? _context1 : _context2);
}
st.Stop();
Console.WriteLine(st.Elapsed);
Console.Read();
}
This benchmark takes 00:00:01.2104391 to run on my hardware (windows is running in KVM, but still). ~100 million context switches per second.
Oh, found a better the place for SynchronizationContext
switching. The same place, where current actor cell is switched.
@rogeralsing Please, check how much does performance drop on your hardware with proper benchmarking setup.
Wasnt the consensus that SynchronizationContext could not be used? as we do not own the threads we run on.
Wasnt the consensus that SynchronizationContext could not be used? as we do not own the threads we run on.
Please, see implementation.
1) SynchronizationContext
is reset after the call in finally
block, so there is no garbage left in thread's state.
2) ActorCellSynchronizationContext
delegates it's work to original one, so it won't break anything.
Generally, a custom SyncCtx should be avoided. If it's possible to solve this by using the logical call context (.NET 4.5+ only), I would prefer that route. Or if it's possible to access context values through some other means, that would be even better (but a much more substantial change).
SyncCtx is more of a "platform type". UIs have them. ASP.NET vCurrent does have one, but in vNext they are planning to remove it. SignalR explicitly leaves the ASP.NET vCurrent context before executing its user-provided code.
These are not conclusive arguments; just points to consider.
Thanks @StephenCleary! Very good to know about this.
Can You explain why a custom sync ctx should be avoided? And why they are moving away from these things in vnext?
-----Original Message----- From: "Stephen Cleary" notifications@github.com Sent: 27/04/2015 19:47 To: "akkadotnet/akka.net" akka.net@noreply.github.com Cc: "Arjen Smits" Deathraven163@gmail.com Subject: Re: [akka.net] TestKit ImplicitSender - async await broken (#907)
Generally, a custom SyncCtx should be avoided. If it's possible to solve this by using the logical call context (.NET 4.5+ only), I would prefer that route. Or if it's possible to access context values through some other means, that would be even better (but a much more substantial change). SyncCtx is more of a "platform type". UIs have them. ASP.NET vCurrent does have one, but in vNext they are planning to remove it. SignalR explicitly leaves the ASP.NET vCurrent context before executing its user-provided code. These are not conclusive arguments; just points to consider. — Reply to this email directly or view it on GitHub.
To be honest, I'm not entirely sure. I was a bit surprised that the SignalR team was adamantly opposed to introducing a SyncCtx, since I consider SignalR as more of a "platform" than a "library".
On the other hand, xUnit (preview v3) always provides a SyncCtx for their unit tests. But that kind of makes sense, since most code will run in some kind of a SyncCtx at runtime.
My gut feelings are:
1) Some components depend on the SyncCtx being a specific type. This is more than just depending on the behavior of the SyncCtx; they're actually downcasting it. I know some ASP.NET components do this today, and IIRC there's some of it on the UI side as well, for things like checking whether there's already a Dispatcher
for the current thread. So, your SyncCtx can be LSP-safe but other code will still fail.
2) Introducing a SyncCtx changes the behavior of the system. In particular, await
behaves differently in the presence of a SyncCtx.
3) Dealing with SyncCtx is messy in the internal .NET framework code. This is due to a historical error that made SyncCtx part of the ExecutionContext
. So there's these fun code paths where the ExecutionContext
is captured explicitly without the SyncCtx, and then the SyncCtx is captured by other code and used later. It's just plain messy.
I think it's for these reasons that there's a general move away from SyncCtx.
I think we can close the issue for now. Tests have their workaround, reentrancy is removed, ambient context will be removed in Akka.Typed, RunTask/ReceiveActor work fine, God in his heaven, all's right with the world.
I think this has caused #2491. At least removing the custom SynchronizationContext makes the tests happy again.
@kekekeks You mentioned 'tests have their workaround'. Am I missing something I should be doing?
@thomastomanek If I remember correctly, ActorCellSynchronizationContext
was the workaround. But that was a long time ago, infrastructure has changed so it might be not necessary anymore.
Another async await issue incoming..
in the
TestKitBase
, we have this lines:This sets the testActor as the current sender for all tests that are not decorated with
INoImplicitSender
So far so good.
But as test methods can be async, this does not work, for the very same reasons we had to hack the world to get async await to work inside actors.
The implicit context is lost in the
await
and all code running in the test method be incorrect. And worse: if there are two parallel tests running at the same time, where one hasINoImplicitSender
and one hasnt' Its completely possible that the context set above leaks over to the other test if that happens to inherit the same thread while awaiting.So now good folks, here we have correct Task based methods, with broken async await support. As these tests do return tasks, if I get what everyone else is saying here, it should be an easy fix?
(In my head it's not, I still think the implicit actor context needs to async flow in the tests also. But I've been wrong before on this topic :-) )
cc @nvivo @kekekeks @Danthar @HCanber
Someone reported this in a PR recently, that the context had leaked over to another test. And had to set
.Current = null
So, shoot, anyone have any ideas how to fix this?