ethereum / py-evm

A Python implementation of the Ethereum Virtual Machine
https://py-evm.readthedocs.io/en/latest/
MIT License

Trinity: Need a transparent way to run services in their own process. #1075

Closed pipermerriam closed 6 years ago

pipermerriam commented 6 years ago

What is wrong?

The problem statement: Our current model for exposing APIs across processes is very verbose, requiring a lot of boilerplate. The overhead for supporting an API across a subprocess boundary is high. High overhead like that isn't sustainable if we're going to expose a lot of cross process APIs which I think we need.

It needs to be lower, ideally much lower. We don't need 100% deep access to everything across service boundaries, but we do need it to be easier to expose access to APIs in a sustainable way.

How can it be fixed

I'm not sure, but I have ideas. (see comments)

pipermerriam commented 6 years ago

Thoughts in no particular order.

Nice to have

cburgdorf commented 6 years ago

Whatever that will be, it will be needed for the plugins as well, which could mean either that there are different kinds of base plugins that use it, or that the plugin manager will use it, or idk. I think it makes sense to come up with something that doesn't live inside the plugin facilities but is designed to work well with them (whatever that will mean in practice).

type checks work across API boundaries

I'm tempted to raise priority for that. The loose typing of the proxy types right now is really nasty and has already caused us quite a few bugs in the past. I remember I once looked into it and there was something strange going on that prevented me from coming up with a way to improve type safety.

pipermerriam commented 6 years ago

Noting here that #1101 is probably a great issue to use to figure this pattern out.

cburgdorf commented 6 years ago

TLDR:

I was about to create a separate issue but now I think a comment here may be better suited. I started thinking about how we could move the transaction pool into its own process. The main problem here is the interaction with the PeerPool, which currently lives in the same process as the transaction pool but would then live outside it.

Currently, the transaction pool contains code such as:

https://github.com/ethereum/py-evm/blob/e9f1216d8d3efe0c465fe0c5cb2a79b0fa3f2154/trinity/plugins/builtin/tx_pool/pool.py#L90-L105

This is going to be problematic in a multiprocess world as the peers are stateful objects that can't just be serialized/deserialized without messing up that state (e.g. state about in-flight requests etc)
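A minimal sketch of the serialization problem, assuming a hypothetical StatefulPeer class (not Trinity's actual peer) that holds a process-local lock alongside its in-flight request state. It only illustrates that such an object does not survive pickling, while its plain data does:

```python
import pickle
import threading


class StatefulPeer:
    """Hypothetical stand-in for a peer: holds live, process-local state."""

    def __init__(self, remote: str) -> None:
        self.remote = remote
        # Process-local resources such as locks (or sockets) cannot be pickled.
        self.lock = threading.Lock()
        self.in_flight_requests = {}


def can_pickle(obj: object) -> bool:
    """Return True if obj can be serialized with pickle, False otherwise."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


peer = StatefulPeer("10.0.0.1:30303")
peer_is_picklable = can_pickle(peer)           # the fat object
remote_is_picklable = can_pickle(peer.remote)  # a plain piece of its data
```

Here the peer itself cannot be pickled, but the plain string describing it can.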

My current thinking is that, in an architecture that relies heavily on decoupled processes, we should largely communicate with shallow, picklable events. E.g. in the context of the PeerPool, that could mean that the PeerPool would raise PeerJoined / PeerLeft events on a generic, application-wide event bus (see #1172 for a PoC), where these events would contain a very shallow, picklable representation of a peer without actual functionality. A simple Data Transfer Object (DTO), basically.

Similarly, other processes could receive a set of all connected peers to loop over (might actually be able to avoid that) but those would also really just loop over a set of DTOs instead of the fat objects we use today.

However, that DTO would contain some unique id that the actual PeerPool (that lives in another process) can match against its internal stateful representation of peers.
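As a rough sketch of what such events could look like: the field names (peer_id, remote) and the cross_process_boundary helper are assumptions made for illustration, and a pickle round trip stands in for the real transport between processes.

```python
import pickle
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerDTO:
    """Shallow, picklable description of a peer (hypothetical fields)."""
    peer_id: str  # unique id the PeerPool can match against its real peer
    remote: str


@dataclass(frozen=True)
class PeerJoined:
    peer: PeerDTO


@dataclass(frozen=True)
class PeerLeft:
    peer: PeerDTO


def cross_process_boundary(event):
    """Simulate sending an event to another process via a pickle round trip."""
    return pickle.loads(pickle.dumps(event))


sent = PeerJoined(PeerDTO(peer_id="peer-1", remote="10.0.0.1:30303"))
received = cross_process_boundary(sent)
```

The receiving side gets an equal but distinct copy: it carries enough information to refer to the peer, and none of the peer's behaviour or connection state.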

In that model a separate process such as the transaction pool wouldn't directly call receiving_peer.sub_proto.send_transactions(filtered_tx) but rather something like eth_protocol_service.send_transactions(dto_peer, transactions). Notice that the eth_protocol_service does live in the same process as the PeerPool. Basically, I think that all the communication with the peers should go through that single process.

So, behind the scenes the eth_protocol_service can look up the correct peer (based on that unique id in the DTO) and perform the actual networking with the peer.
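The lookup could be sketched roughly like this. LivePeer, register, and the boolean return value are assumptions made for the example, not existing Trinity code; only the send_transactions(dto_peer, transactions) shape comes from the description above.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class PeerDTO:
    peer_id: str


class LivePeer:
    """Hypothetical stateful peer that only lives in the networking process."""

    def __init__(self, peer_id: str) -> None:
        self.peer_id = peer_id
        self.sent: List[bytes] = []

    def send_transactions(self, transactions: Sequence[bytes]) -> None:
        # Real networking would happen here; this sketch records the call.
        self.sent.extend(transactions)


class EthProtocolService:
    """Lives next to the PeerPool and owns the id -> live peer mapping."""

    def __init__(self) -> None:
        self._peers: Dict[str, LivePeer] = {}

    def register(self, peer: LivePeer) -> PeerDTO:
        self._peers[peer.peer_id] = peer
        return PeerDTO(peer.peer_id)

    def send_transactions(self, dto_peer: PeerDTO,
                          transactions: Sequence[bytes]) -> bool:
        peer = self._peers.get(dto_peer.peer_id)
        if peer is None:
            return False  # the peer left since the DTO was handed out
        peer.send_transactions(transactions)
        return True


service = EthProtocolService()
live = LivePeer("peer-1")
dto = service.register(live)
ok = service.send_transactions(dto, [b"tx1", b"tx2"])
missing = service.send_transactions(PeerDTO("gone"), [b"tx3"])
```

Callers only ever hold the DTO, so a peer that has disconnected in the meantime shows up as a failed lookup rather than a stale object.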

Also note that the transactions here aren't critical, as those already qualify as non-stateful, picklable DTOs.

pipermerriam commented 6 years ago

cc @cburgdorf

I think I'm quickly approaching being in full agreement with you. Here's a PR with me experimenting with how to move the JSON-RPC server to a dedicated process:

https://github.com/ethereum/py-evm/pull/1191

It's pretty gross. Some of that is just me doing a quick/dirty implementation to see what it is like. However, there is a decent bit of boilerplate needed to share objects across process boundaries, and in general it is very fragile.

The event bus idea is sitting well with me; however, there are a few things that I believe will still require us to use some of the multiprocessing shared objects. The main one that comes to mind is eth_call, which will need a Chain class locally that uses the ProxyBaseDB under the hood to get at the database while keeping the CPU stuff in the JSON-RPC process.

cburgdorf commented 6 years ago

I think I'm quickly approaching being in full agreement with you

Happy to see our ideas are converging!

Here's a PR with me experimenting with how to move the JSON-RPC server to a dedicated process

I'll check it out in a minute.

However, there is a decent bit of boilerplate needed to share objects across process boundaries, and in general it is very fragile

The event bus idea is sitting well with me, however, there are a few things that I believe will still require us to use some of the multiprocessing shared object

I agree. It's my current thinking that we will need some of the proxy stuff. In fact, I guess many plugins may end up providing an RPC style API based on these proxy types to be called from other processes. However, I think it should be minimized as much as possible. There are several reasons why I think so:

  1. RPC style based on proxies is less explicit, hiding the fact that we are in fact doing one of the most costly operations possible (that is, cross-process communication). Or, to cite from the article linked below:

Messaging as a communication concept is very much different from RPC in that it does not attempt to hide the physical aspects of communication. It is still trying to hide the implementation details, but not to the point of dismissing the notions related to run-time costs of exchanging data

  2. So far, typing around proxies seems to have unsolved problems which have caused us lots of runtime errors in the past (things added on the actual type but not on the proxy, etc). It's probably fixable but we haven't found a good solution for it afaik.

  3. I believe RPC style with proxies is less efficient and robust than messaging in general. I don't have hard numbers to back this up, but I worked on distributed systems in .NET back in 2010 and it was the consensus in the .NET community that RPC style should be avoided. And I think this has evolved into a language / framework agnostic belief. Related short article that I found: http://www.inspirel.com/articles/RPC_vs_Messaging.html

So, while my current thinking still includes that we'll need some of the proxy stuff, I think the less we need of it, the better. I think that we could even build more APIs on top of the message bus that would allow us to get rid of even more RPC style calls.

pipermerriam commented 6 years ago

@cburgdorf maybe you can take #1191 and do another iteration on it for a POC for changing it to use an event bus to retrieve the peer count.
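A rough sketch of what retrieving the peer count via explicit request/response messages might look like. This is not the event bus API: two standard-library queues and a thread stand in for the bus and the networking process, and all names (PeerCountRequest, PeerCountResponse, networking_side, request_peer_count) are made up for the example.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerCountRequest:
    request_id: int


@dataclass(frozen=True)
class PeerCountResponse:
    request_id: int
    peer_count: int


def networking_side(requests, responses, connected_peers):
    """Answer peer count requests until a None sentinel arrives."""
    while True:
        message = requests.get()
        if message is None:
            break
        if isinstance(message, PeerCountRequest):
            responses.put(
                PeerCountResponse(message.request_id, len(connected_peers))
            )


def request_peer_count(requests, responses, request_id, timeout=1.0):
    """Send a request, then wait (with a timeout) for the matching response."""
    requests.put(PeerCountRequest(request_id))
    response = responses.get(timeout=timeout)
    if response.request_id != request_id:
        raise RuntimeError("response does not match request")
    return response.peer_count


requests_q = queue.Queue()
responses_q = queue.Queue()
worker = threading.Thread(
    target=networking_side,
    args=(requests_q, responses_q, {"peer-1", "peer-2", "peer-3"}),
)
worker.start()
peer_count = request_peer_count(requests_q, responses_q, request_id=1)
requests_q.put(None)  # tell the networking side to shut down
worker.join()
```

Unlike a proxy call, the caller sees that it is sending a message and waiting on a reply, including the timeout it has to handle.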

cburgdorf commented 6 years ago

Will do. My head is full of thoughts but I'm on mobile (on a beautiful Croatian beach btw) so I'll share them another time.



pipermerriam commented 6 years ago

I've removed this from the "Dorothy Vaughan" release as I believe we have the following tasks blocking this (all of which I think need to have their own issues written up)

cburgdorf commented 6 years ago

@pipermerriam here's a super rough and buggy PoC

https://github.com/ethereum/py-evm/pull/1202

cburgdorf commented 6 years ago

While there is certainly lots of room for improvement, I think the gist of the issue has been addressed.

  1. Services that run in separate processes should be written as plugins derived from BaseIsolatedPlugin

  2. The way to communicate with other processes is through the lahja eventbus.

  3. A reference implementation exists with the JSON-RPC server that was recently refactored into a plugin.

https://github.com/ethereum/py-evm/blob/1006923c7d2f4f6e6317e172757967c7a4b7bf02/trinity/plugins/builtin/json_rpc/plugin.py#L30-L70

I'm closing this for now. Feel free to reopen if you think we should keep this open.