akkadotnet / akka.net

Canonical actor model implementation for .NET with local + distributed actors in C# and F#.
http://getakka.net

Better remoting support #1

Closed rogeralsing closed 10 years ago

rogeralsing commented 10 years ago

Currently, when a RemoteActorRef is resolved, it spawns a new connection. This is not the expected behavior. Instead, it should reuse an existing connection to the remote system if one exists.

rogeralsing commented 10 years ago

Protobuf serialization works.

We need to resolve remote connections via actors. Each unique connection should have its own actor, which is responsible for all of its IO.

rogeralsing commented 10 years ago

The approach here would be something along these lines:

1) A RemoteActorRef is resolved.
2) Once we have the RemoteActorRef, we need a connection for it.
3) We query a network IO actor, which internally checks its child list to find a connection actor for the given remote system.
4) The existing or newly created connection actor is returned and used from within the RemoteActorRef.

resolve RemoteActorRef -> query master IO actor -> resolve child network actor -> return and use in the RemoteActorRef.
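A minimal sketch of that lookup, written in Python for brevity rather than C#. Every name here (`IoManager`, `ConnectionActor`, `connection_for`) is hypothetical and only illustrates the "one connection actor per remote system" idea. It is not the Akka or Akka.NET API, and plain objects stand in for real actors.

```python
class ConnectionActor:
    """Hypothetical stand-in for an actor that owns one connection's IO."""

    def __init__(self, remote_address):
        self.remote_address = remote_address
        self.sent = []

    def send(self, message):
        # A real actor would write to a socket; this sketch only records the message.
        self.sent.append(message)


class IoManager:
    """Hypothetical '/io' parent that keeps one child per remote system."""

    def __init__(self):
        self._children = {}

    def connection_for(self, remote_address):
        # Reuse the existing child if there is one, otherwise create it.
        child = self._children.get(remote_address)
        if child is None:
            child = ConnectionActor(remote_address)
            self._children[remote_address] = child
        return child

    def child_count(self):
        return len(self._children)


class RemoteActorRef:
    """Hypothetical ref that asks the IO manager for its connection once."""

    def __init__(self, path, remote_address, io_manager):
        self.path = path
        self._connection = io_manager.connection_for(remote_address)

    def tell(self, message):
        self._connection.send((self.path, message))
```

With this shape, two refs that point at the same remote system share one connection object, so resolving a second ref no longer opens a second connection.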

Maybe we should introduce a new top-level guardian "/io" for this and keep all network actors under it.

rogeralsing commented 10 years ago

See https://github.com/akka/akka/tree/master/akka-actor/src/main/scala/akka/io

There is all sorts of nice stuff in there: a TcpConnection actor, an OutgoingTcpConnection actor, etc.

Hope you like Scala ;-)

rogeralsing commented 10 years ago

RemoteActorRefs should be given a RemoteTransport object when created. There is one RemoteTransport per ActorSystem (?), and there are many "Transport"s per system. The Transport seems to be what connects two systems.

https://github.com/akka/akka/blob/master/akka-remote/src/main/scala/akka/remote/transport/Transport.scala

I probably need to ask some of the Akka devs on how this works...

rogeralsing commented 10 years ago

https://github.com/akka/akka/blob/master/akka-remote/src/main/scala/akka/remote/Remoting.scala#L282 This seems to be where the association between addresses and actorrefs is handled.

I guess we have to roll our own approach for this

rogeralsing commented 10 years ago

I've done some refactoring that will affect this one. All remoting related stuff now resides in the RemoteActorRefProvider.

rogeralsing commented 10 years ago

I've prepped a bit for this one. There is now a RemoteTransport class that should be the base of the remoting support. In Akka, the Remoting class in Remoting.scala inherits from RemoteTransport. There is also an EndpointManager, which is an actor responsible for handling connections.

rogeralsing commented 10 years ago

Done some more prepping. The EndpointManager now resolves Transports via the config, and the new RemoteActorRefs are implemented the same way as in real Akka.

rogeralsing commented 10 years ago

@Aaronontheweb I have removed the old remoting support. Instead we are using a semi-port of the Akka remoting, so we are no longer leaking connections. Should be safe for you to use the new impl now.

rogeralsing commented 10 years ago

Related to #90: be aware that the current EndpointActor is hardcoded to use TCP just to get things up and working. This should not be the case; all of that should be hidden behind the current Transport. cc @Aaronontheweb
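To illustrate what "hidden behind the Transport" could look like, here is a rough sketch in Python rather than C#. The `Transport`, `InMemoryTransport`, and `Endpoint` classes and their methods are made up for this example and do not mirror the real Akka.Remote types. The point is only that the endpoint talks to an abstract interface and never to TCP directly.

```python
from abc import ABC, abstractmethod


class Transport(ABC):
    """Hypothetical abstraction over the wire protocol (TCP, UDP, in-memory...)."""

    @abstractmethod
    def send(self, remote_address, payload):
        """Deliver payload to the remote system at remote_address."""


class InMemoryTransport(Transport):
    """Hypothetical test double that records deliveries instead of using a socket."""

    def __init__(self):
        self.delivered = []

    def send(self, remote_address, payload):
        self.delivered.append((remote_address, payload))


class Endpoint:
    """Hypothetical endpoint that only knows about the Transport interface."""

    def __init__(self, remote_address, transport):
        self._remote_address = remote_address
        self._transport = transport

    def write(self, payload):
        # No protocol-specific code here; the transport decides how bytes travel.
        self._transport.send(self._remote_address, payload)
```

Swapping the in-memory transport for a TCP-backed one would then require no change to the endpoint itself.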

Aaronontheweb commented 10 years ago

Started off with the fundamentals on this one - rewriting the EndpointActor into an abstract base class and breaking out the EndpointWriter / EndpointReader into their own implementations.

However, upon going through the Akka source in detail today I decided to beef up the Transport implementation to follow the Akka spec more closely.

On my own fork / branch, I've implemented the TestTransport and the TestTransportSpec - all of whose methods currently pass: https://github.com/Aaronontheweb/akka.net/blob/remote-fixes/src/Akka.Remote/Transport/TestTransport.cs

https://github.com/Aaronontheweb/akka.net/blob/remote-fixes/test/Akka.Remote.Tests/Transport/TestTransportSpec.cs

Looks like I have all of the basic plumbing for connecting the transports directly to the individual actors in a way that abstracts them sufficiently.

Keeping everything in my fork for the time being since I broke the TCP Transport layer earlier - should have that fixed by tomorrow.

I'll squash all of the commits and send in a pull request within 48 hours - I don't think I'll have the reliable messaging stuff implemented (ACKing), but I should have the endpoint actors, transports, and the endpoint management + supervision all working with initial support for TCP.

Aaronontheweb commented 10 years ago

Brief update here...

It's been a lot more work than I thought to port over everything needed for reliable TCP support in Akka.NET - there's just a large number of dependencies built into it.

So here's where we are:

COMPLETED

WORKING ON

Here's what I'm currently working on:

Going to try to accomplish all of that this weekend. I'm sure I'll stumble across more stuff that needs to be implemented in order to accomplish those, but we'll see how fast I can go!

NOT DOING

There are some features in the remoting system that, while useful, are outside the scope of what needs to be accomplished for the V1 release and will not be addressed until afterwards.

I'll keep you updated!

Aaronontheweb commented 10 years ago

Quick update on this issue - currently adding dual server / client channel support to Helios, a networking middleware library that I wrote in preparation for my own Actor system.

I'm pretty far along with the infrastructure - it should be comparable to Netty both in terms of end-developer experience and performance characteristics, although given that Netty is a robust / stable / mature / and generally awesome framework there are likely some differences / feature gaps / bugs that I have yet to uncover with Helios.

Goal is to integrate Helios back into Akka.Remote and provide both UDP and TCP support out of the box.

Aaronontheweb commented 10 years ago

Update for this issue, because other folks working on the dev branch have noticed (#116):

Completed the HeliosTransport and the HeliosTcpTransport today - had to make A TON of changes to Helios over the past few days and there are still some issues with it that I need to resolve, but they're detail-oriented. The core of the Helios stuff works pretty well at the moment.

I was able to make some changes to Config and verify that RemoteSettings can now load correctly, and it correctly uses a built-in Remote.conf as a fallback for any remote configuration settings.
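A small sketch of the fallback idea, in Python rather than C#. The `Config` class and the setting names below are invented for illustration; they are not the actual Akka.NET configuration API or the real contents of Remote.conf.

```python
class Config:
    """Hypothetical layered config: look up locally, then in the fallback."""

    def __init__(self, values, fallback=None):
        self._values = dict(values)
        self._fallback = fallback

    def with_fallback(self, fallback):
        # Append the new fallback at the end of any existing fallback chain.
        if self._fallback is None:
            return Config(self._values, fallback)
        return Config(self._values, self._fallback.with_fallback(fallback))

    def get(self, key, default=None):
        if key in self._values:
            return self._values[key]
        if self._fallback is not None:
            return self._fallback.get(key, default)
        return default


# Made-up keys standing in for built-in remote defaults.
builtin_remote_defaults = Config({"remote.port": 2552, "remote.hostname": "localhost"})

# User config overrides one setting and inherits the rest.
user_config = Config({"remote.port": 8080}).with_fallback(builtin_remote_defaults)
```

Here `user_config.get("remote.port")` comes from the user's settings, while `user_config.get("remote.hostname")` falls through to the built-in defaults.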

I have to make a number of changes to RemoteActorRef starting tomorrow and once that's done I think I'll be able to release a new build of Helios to NuGet and ship a stable release of Akka.Remote.

Aaronontheweb commented 10 years ago

Err, RemoteActorRefProvider - not RemoteActorRef

Aaronontheweb commented 10 years ago

So good news on this - have all of the infrastructure I wrote for this up and running and we're able to accept inbound connections now.

There are still some issues with properly sending user-defined data over outbound connections, and I suspect it's a dumb bug buried somewhere inside the ProtocolStateActor FSM. I'm in the process of porting the RemotingSpec unit test over to help isolate the issue.

The FailureDetector also works marvelously - it's able to correctly send heartbeats over the network and receive them back on regular intervals, and it will shut down the transport if I keep the debugger open for too long ;)
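For anyone following along, the debugger behavior falls out of how a heartbeat-based detector works. Below is a deliberately simplified deadline-style sketch in Python. It is not necessarily the detector used here, and the class name, the `acceptable_pause` parameter, and the injected clock are all assumptions made for the example.

```python
class DeadlineFailureDetector:
    """Hypothetical detector: unavailable once heartbeats pause for too long."""

    def __init__(self, acceptable_pause, clock):
        self._acceptable_pause = acceptable_pause  # seconds without a heartbeat we tolerate
        self._clock = clock                        # callable returning the current time in seconds
        self._last_heartbeat = None

    def heartbeat(self):
        # Called whenever a heartbeat arrives from the remote system.
        self._last_heartbeat = self._clock()

    def is_available(self):
        if self._last_heartbeat is None:
            # No heartbeat seen yet: assume healthy until monitoring starts.
            return True
        elapsed = self._clock() - self._last_heartbeat
        return elapsed <= self._acceptable_pause


class FakeClock:
    """Manually advanced clock so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now
```

Sitting on a breakpoint stops heartbeats from being processed, so the elapsed time exceeds the acceptable pause and the remote side is treated as failed.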

I have a bit more work left to do on this before I send in the final pull request to close this issue. Almost there!

I'll create a subsequent issue for adding UDP support (should be easy to do, but requires a bit of manual garbage collection on termination.)

Aaronontheweb commented 10 years ago

Resolved.