lasp-lang / partisan

High-performance, high-scalability distributed computing for the BEAM.
https://partisan.dev
Apache License 2.0

partisan instead of teleport? #42

Closed tsloughter closed 1 year ago

tsloughter commented 7 years ago

As I was once again looking to revive https://github.com/vagabond/teleport, I realized partisan might actually be meant for this itself. Description of teleport: https://vagabond.github.io/programming/2015/03/31/chasing-distributed-erlang

Basically, is partisan meant to be used for message passing between nodes in place of distributed Erlang? For some reason I had it in my head that it was for fault detection and gossip only, not for my application's processes to send messages directly to one another.

cmeiklejohn commented 7 years ago

Lasp uses Partisan for all internode communication. This is the direction we are continuing to both research and develop, so yes, I consider Partisan an alternative to teleport.

While I consider teleport an interesting engineering solution to the problem of getting around distributed Erlang (I had my own version of it called ringleader; both it and Andrew's teleport were motivated by problems we discovered internally when building Riak MDC replication), I do not believe teleport alone is sufficient for building large-scale Erlang applications with various topologies: I think the solution teleport provides is only ideal for adapting existing Erlang applications to avoid head-of-line blocking and flow control issues with distributed Erlang.

That said, Partisan is aimed at becoming a middleware solution for supporting large-scale Erlang applications: fault detection, reliable membership, and reliable broadcast and unicast.

cmeiklejohn commented 7 years ago

For what it's worth, the Lasp fork of Plumtree uses partisan for membership and all of the underlying transport communication between members of the broadcast tree: this is the primary difference from the Basho/Helium-maintained fork.

cmeiklejohn commented 7 years ago

Further comments: (tagging @Vagabond)

I'm open to hearing more feedback.

cmeiklejohn commented 7 years ago

Generally speaking, I think it would be valuable to identify the useful contributions in teleport and migrate them over, given that we've done significant stress testing of this library and that it provides higher-level services than just connection maintenance.

tsloughter commented 7 years ago

Regarding term_to_binary, I think he ended up finding that the term_to_iolist he wrote performs better than anything else he tried.
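
For context, the baseline both libraries build on is the standard external term format: term_to_binary/1 on the sender, binary_to_term/1 on the receiver, framed over a length-prefixed TCP socket; teleport's term_to_iolist is a hand-rolled alternative to the encoding step, and its internals aren't shown here. A minimal sketch of that baseline (module and function names are illustrative, not either library's API):

```erlang
%% Minimal sketch: ship an arbitrary Erlang term to a peer over a raw
%% TCP socket using the standard external term format. The socket is
%% assumed to be opened with [binary, {packet, 4}, {active, false}],
%% so each message is framed with a 4-byte length prefix.
-module(wire_demo).
-export([send_term/2, recv_term/1]).

send_term(Socket, Term) ->
    Bin = term_to_binary(Term),          %% standard serializer
    gen_tcp:send(Socket, Bin).           %% {packet, 4} adds the length prefix

recv_term(Socket) ->
    {ok, Bin} = gen_tcp:recv(Socket, 0), %% one framed message
    binary_to_term(Bin).
```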

tsloughter commented 7 years ago

And yeah, multiple connections between hosts are a must. I'm not sure whether they could easily be kept separate from the failure-detection connection? Those could then also be lazy if the user wanted.

cmeiklejohn commented 7 years ago

If multiple connections are a must, then are we going to try to preserve FIFO ordering between messages or not?

tsloughter commented 7 years ago

I would assume not, unless there were a low-overhead way to do it only for messages to the same process. Maybe even just mapping the {To, From} tuple to the same connection from the pool, but, shrug, that is likely to cause problems.

Vagabond commented 7 years ago

I think I sharded packets across the node-to-node connection pool using the source-destination tuple as the shard key, so process-to-process messaging order was preserved. Total message ordering is obviously an entirely different kettle of fish. I also punted on reply delivery.
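
The scheme described here can be sketched with a small helper: hash the source/destination pair onto a fixed-size connection pool so that a given pair of processes always uses the same socket, preserving per-pair FIFO without funnelling everything through one connection. Names are illustrative, not partisan's or teleport's API:

```erlang
%% Sketch: pick a connection from a per-peer pool using the
%% source/destination pair as the shard key. The same pair always
%% hashes to the same slot, so messages between those two processes
%% stay FIFO; messages between different pairs may be interleaved
%% across sockets.
-module(shard_demo).
-export([pick_connection/3]).

pick_connection(From, To, Pool) when Pool =/= [] ->
    Index = erlang:phash2({From, To}, length(Pool)),  %% 0..N-1
    lists:nth(Index + 1, Pool).                       %% lists:nth/2 is 1-based
```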

On some of the other points: distributed Erlang cheats a lot to be fast. Ideally you'd have something like partisan for 'regular' disterl, and maybe track communication patterns between processes; in the case of a particularly hot path between two processes, spawn a dedicated connection between them instead of using the global disterl port. You could use process_flag or something like that to indicate whether a process was allowed to opt in to such a network optimization.

I would argue that message ordering beyond the ordering between two processes is not strongly relied upon in most Erlang code and can be sacrificed for network performance in many cases. IIRC the message ordering in distributed Erlang isn't that strongly guaranteed anyway.

evanmcc commented 7 years ago

My read, taking Christopher's points in order:

I think that pulling stuff over into partisan (or adding teleport as a dep and just using the things that we need) is reasonable; it's mostly a matter of what order to do things in. The first thing I'd do is work on a design for multi-connection (since that's the biggest semantic change), then work on a parse transform, then pull over the performance stuff as an optimization if needed. Does this seem reasonable?


evanmcc commented 7 years ago

Correction: it looks like Benoit's code is more 'inspired by' than 'based on'.

cmeiklejohn commented 7 years ago

> I think I sharded packets across the node-to-node connection pool using the source-destination tuple as the shard key, so process-to-process messaging order was preserved. Total message ordering is obviously an entirely different kettle of fish. I also punted on reply delivery.

Is this guaranteed across disconnects/reconnects, though? A simple hashing function (even chash in some cases given particular join behavior) doesn't appear to be enough to capture this concern.
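
To make the concern concrete: a plain modulo hash depends on the current pool size, so a disconnect that shrinks the pool (or a reconnect that grows it) can silently move a process pair onto a different socket mid-conversation. A tiny illustration, not partisan code:

```erlang
%% Sketch of the failure mode: the slot a pair lands on depends on the
%% current pool size, so resizing the pool after a disconnect can
%% reroute the pair and break per-pair FIFO across the reconnect.
-module(rehash_demo).
-export([moved/3]).

%% True when changing the pool size from SizeBefore to SizeAfter would
%% route the given {From, To} pair over a different connection slot.
moved(Pair, SizeBefore, SizeAfter) ->
    erlang:phash2(Pair, SizeBefore) =/= erlang:phash2(Pair, SizeAfter).
```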

cmeiklejohn commented 7 years ago

> On some of the other points: distributed Erlang cheats a lot to be fast. Ideally you'd have something like partisan for 'regular' disterl, and maybe track communication patterns between processes; in the case of a particularly hot path between two processes, spawn a dedicated connection between them instead of using the global disterl port. You could use process_flag or something like that to indicate whether a process was allowed to opt in to such a network optimization.

We originally had the default connection manager do a full mesh over disterl, but I think it was eventually replaced with plain TCP. We could bring that back, though.

cmeiklejohn commented 7 years ago

> I would argue that message ordering beyond the ordering between two processes is not strongly relied upon in most Erlang code and can be sacrificed for network performance in many cases. IIRC the message ordering in distributed Erlang isn't that strongly guaranteed anyway.

Agreed; we want to support applications where FIFO is not required. However, this highlights a particularly interesting point: existing applications that rely on this behavior (such as many internal Ericsson modules) would not be able to take advantage of our work.

cmeiklejohn commented 7 years ago

> I think that pulling stuff over into partisan (or adding teleport as a dep and just using the things that we need) is reasonable; it's mostly a matter of what order to do things in. The first thing I'd do is work on a design for multi-connection (since that's the biggest semantic change), then work on a parse transform, then pull over the performance stuff as an optimization if needed. Does this seem reasonable?

Sounds great to me.

I'd say that first we want to focus on getting SSL/TLS working, and then go multi-connection, if that makes sense with everyone else.

evanmcc commented 7 years ago

Starting to actually work on this for vonnegut, I noticed that there are a few things we need that partisan doesn't currently provide:

  1. remote calls and casts (and various other gen_* messages)
  2. remote monitoring for the above
  3. via support for the above.

Basic versions of 1 and 3 are probably pretty easy to add, but 2 adds a fair bit of complexity. Is this something that you're interested in, or had you only envisioned partisan working with message-oriented protocols?
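
For the first item, one plausible shape is to layer remote calls and casts on top of whatever point-to-point send primitive partisan exposes: the sender wraps the request in an envelope, and a small relay process on the receiving node re-dispatches it locally. A rough sketch under that assumption; forward/3 and relay_name/0 are placeholders, not partisan's actual API:

```erlang
%% Rough sketch of layering a remote cast on a message-oriented
%% transport. forward/3 stands in for whatever point-to-point send the
%% membership layer provides; relay_name/0 names a shim process assumed
%% to run on every node and receive forwarded envelopes.
-module(remote_cast_demo).
-export([cast/3, handle_envelope/1]).

relay_name() -> remote_cast_relay.

%% Caller side: wrap the cast in an envelope and hand it to the transport.
cast(Node, ServerRef, Request) ->
    forward(Node, relay_name(), {'$remote_cast', ServerRef, Request}).

%% Receiver side: the relay process unpacks the envelope and performs an
%% ordinary local cast; a remote call would additionally carry a reply
%% address, and remote monitoring needs node-level bookkeeping on top.
handle_envelope({'$remote_cast', ServerRef, Request}) ->
    gen_server:cast(ServerRef, Request).

%% Placeholder transport call; substitute the library's real send here.
forward(_Node, _ServerRef, _Message) ->
    ok.
```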

cmeiklejohn commented 7 years ago

We're interested in all three of the proposed things and are willing to provide support where possible. Maybe we can open an issue for each of these items and put together a plan for integrating them. Would that be an acceptable approach?

evanmcc commented 7 years ago

cool.

Since the existing code uses the ServerSpec unchanged, I am not sure that 3 is actually needed.

I have unidirectional versions of send, cast, and call working (up from cast). I'll open issues for the other two, and try to see whether 3 is actually needed before opening one there.

cmeiklejohn commented 7 years ago

> Since the existing code uses the ServerSpec unchanged, I am not sure that 3 is actually needed.

What do you mean?

evanmcc commented 7 years ago

Via support is just a special tuple, {via, MyReg, [some, args]}, so as long as that is faithfully serialized and deserialized, it should just work.

See the definition of ServerRef here: http://erlang.org/doc/man/gen_server.html#call-3
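
To spell out the point: a {via, Module, Name} server reference is ordinary data, so if it survives serialization intact, gen_server resolves it through the registry module on whichever node executes the call. A small reminder of the shape; my_registry and my_server below are stand-ins, not real modules (my_registry would need to export register_name/2, unregister_name/1, whereis_name/1, and send/2):

```erlang
%% Sketch: addressing a gen_server through a via tuple. The tuple is
%% plain data, so a transport only needs to carry it faithfully.
-module(via_demo).
-export([start/0, ping/0]).

server_ref() -> {via, my_registry, [some, args]}.

start() ->
    %% Registers the server under the via name at start time.
    gen_server:start_link(server_ref(), my_server, [], []).

ping() ->
    %% gen_server resolves the name by calling
    %% my_registry:whereis_name([some, args]) before sending.
    gen_server:call(server_ref(), ping).
```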