ethereum / aleth

Aleth – Ethereum C++ client, tools and libraries
GNU General Public License v3.0

Project idea: HTTP Proxy for JSON-RPC #4563

Open chfast opened 7 years ago

chfast commented 7 years ago

Project idea: HTTP Proxy for JSON-RPC

Motivation

I'd like to remove the HTTP server from the C++ Ethereum node. The node should only expose RPC access via the most primitive transports: Unix Sockets, and Named Pipes on Windows. The HTTP transport should be provided by an external tool translating HTTP requests/responses to/from the protocol required by Unix Sockets and Named Pipes.

Notes about design

  1. The proxy may be done in a language other than C++. Go looks friendly.
  2. The proxy can provide other transports than HTTP, e.g. WebSocket, TCP.
  3. The node can still support the --rpc flag. It should execute and configure the proxy as another process. This is quite common in the Unix world. This is how git supports SSH, etc.
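The translation described under Motivation could be sketched roughly as below. This is a minimal Python sketch, not a design decision (the notes above lean towards Go). It assumes the node replies with exactly one JSON value per request; the function names, the default port, and the way the end of a reply is detected are assumptions made for illustration.

```python
import json
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer


def forward_request(socket_path: str, payload: bytes, timeout: float = 5.0) -> bytes:
    """Send one JSON-RPC request over a Unix socket and return the raw reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        sock.sendall(payload)
        buf = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("node closed the connection mid-reply")
            buf += chunk
            try:
                json.loads(buf)  # assumption: a reply is complete once it parses
                return buf
            except ValueError:
                continue  # incomplete JSON so far, keep reading


def make_handler(socket_path: str):
    """Build an HTTP handler class bound to one node socket (hypothetical helper)."""

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            try:
                reply, status = forward_request(socket_path, body), 200
            except OSError as err:  # covers timeouts and connection errors
                reply, status = json.dumps({"error": str(err)}).encode(), 502
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(reply)))
            self.end_headers()
            self.wfile.write(reply)

    return ProxyHandler


def serve(socket_path: str, host: str = "127.0.0.1", port: int = 8545):
    """Run the proxy until interrupted; host and port are illustrative defaults."""
    HTTPServer((host, port), make_handler(socket_path)).serve_forever()
```

Opening a fresh socket connection per HTTP request keeps the sketch simple; a real proxy would probably reuse connections and would need a Named Pipe equivalent on Windows.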

Privileges

Problem

We would like to have different (and configurable) permissions for RPC modules depending on the transport protocol. Let's assume there are 2 RPC modules: eth with public blockchain data, and admin. We must allow access to both the admin and eth modules via Unix Sockets and Named Pipes while at the same time not allowing access to admin via HTTP. Users should also be able to configure module access permissions per transport protocol.

This can be more complicated if we consider allowed HTTP hosts.

Solution 1: Blacklisting

This should match current geth behavior, where by default all modules can be accessed via Unix Sockets and only some modules can be accessed by HTTP.

In this solution all modules are accessible by default via Unix Sockets and Named Pipes. When the proxy process is started it may send a special message (it can be a JSON-RPC message) stating which modules are to be disabled. The node must allow this message to be sent only once per connection.

Solution 2: Whitelisting

Similar to solution 1, but this time no module is enabled by default. The proxy must send a special message listing the modules to be enabled. The node must allow this message to be sent only once per connection.

This would also require changes to tools like ethereum-console and geth attach. They would also have to send the whitelist on startup.
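Solutions 1 and 2 differ only in the default state of a connection, so the node-side bookkeeping could look roughly like this. It is a sketch under assumptions: the class and method names are hypothetical, and it relies on the usual JSON-RPC naming where the module is the part of the method name before the first underscore (eth_call belongs to eth).

```python
class ConnectionAccess:
    """Tracks which RPC modules a single IPC connection may call."""

    def __init__(self, all_modules, whitelist_mode: bool):
        self._all = frozenset(all_modules)
        self._whitelist_mode = whitelist_mode
        # Solution 1 (blacklist) starts with everything enabled,
        # Solution 2 (whitelist) starts with nothing enabled.
        self._enabled = set() if whitelist_mode else set(self._all)
        self._configured = False

    def configure(self, modules):
        """Apply the special message; allowed only once per connection."""
        if self._configured:
            raise PermissionError("access list already set for this connection")
        requested = set(modules) & self._all
        if self._whitelist_mode:
            self._enabled = requested
        else:
            self._enabled -= requested
        self._configured = True

    def allows(self, method: str) -> bool:
        """Check a method name such as 'eth_call' against the enabled modules."""
        module = method.split("_", 1)[0]
        return module in self._enabled
```

With blacklisting, an HTTP proxy would call configure(["admin"]) right after connecting; with whitelisting, every client, including console tools, would have to call configure before its first request.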

Solution 3: Access token

I noticed that in C++ the admin RPC module requires a special token to be passed as a part of the JSON-RPC request. I think the token is generated on every node startup.

pirapira commented 7 years ago

What happens when we drop HTTP without adding the proxy? The Travis scripts on Solidity and Bamboo use Unix Sockets.

chfast commented 7 years ago

What happens when we drop HTTP without adding the proxy? The Travis scripts on Solidity and Bamboo use Unix Sockets.

At the moment, probably nothing. But I'd like to at least have a plan and a design for how to add the support back in the future. This just looks to me like a nice small inter-team project.

chriseth commented 7 years ago

What is the polling overhead of the different technologies? I heard that web sockets are quite good in that regard.

holiman commented 7 years ago

I think it sounds like a good idea to have the http-rpc as a standalone thing. The interface could be a lot more refined if it were independent (custom certificates, multiplexing, access controls etc).

What is the polling overhead of the different technologies? I heard that web sockets are quite good in that regard.

Yup, websockets are essentially tcp-sockets, so as long as it's not closed, the server can push.

bas-vk commented 7 years ago

I thought about something like this a couple of months ago, mainly for the reasons holiman mentioned. Such a proxy could also do account management and handle signing requests (provide a UI to the user to authorize a signing request). Nodes would only operate on public data.

Websockets begin their life as http connections and are upgraded to websockets. Geth already supports this and allows a client to subscribe to events such as new headers and logs. The client will receive a notification that contains the event data. No polling required.

chfast commented 7 years ago

Such a proxy could also do account management and handle signing request (provide a UI to the user to authorize a signing request). Nodes would only operate on public data.

I was thinking about this as a separate project. Such an agent would have access to the accounts' keyfiles and be placed in the middle of the RPC communication. It would translate "personal" requests into "raw" requests, e.g. personal_sendTransaction into eth_sendRawTransaction.
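The rewriting step of such an agent might look like the following sketch. The sign_transaction callback is hypothetical and stands in for everything key-related (loading the keyfile, filling in the nonce and gas fields, signing); only the request reshaping is shown.

```python
def translate_personal(request: dict, sign_transaction) -> dict:
    """Rewrite personal_sendTransaction into eth_sendRawTransaction.

    sign_transaction(tx, passphrase) is a hypothetical callback that returns
    the signed transaction as a hex string. Other requests pass through as-is.
    """
    if request.get("method") != "personal_sendTransaction":
        return request
    tx, passphrase = request["params"]
    raw = sign_transaction(tx, passphrase)
    return {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        "method": "eth_sendRawTransaction",
        "params": [raw],  # the passphrase never reaches the node
    }
```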

But maybe it is not a bad idea to merge these 2 projects into a single one.

axic commented 6 years ago

One of the downsides of the Unix socket approach is framing of the JSON (also not sure that named pipes allow multiple connections?)

The framing in Web Sockets (see https://tools.ietf.org/html/rfc6455#section-5.2) could be used, and that would make this proxy pretty transparent as it would only need to add the HTTP framing on top.
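For reference, the base framing from RFC 6455 section 5.2 is small enough to sketch. The Python below handles single, unfragmented frames only: it ignores continuation and control frames, and the function names are made up for this example.

```python
import struct


def encode_text_frame(message: str) -> bytes:
    """Encode one unmasked, unfragmented text frame."""
    payload = message.encode("utf-8")
    header = bytes([0x81])  # FIN=1, opcode=0x1 (text)
    n = len(payload)
    if n < 126:
        header += bytes([n])
    elif n < (1 << 16):
        header += bytes([126]) + struct.pack("!H", n)  # 16-bit extended length
    else:
        header += bytes([127]) + struct.pack("!Q", n)  # 64-bit extended length
    return header + payload


def decode_frame(buf: bytes):
    """Return (payload, rest), or None if buf does not hold a full frame yet."""
    if len(buf) < 2:
        return None
    masked = bool(buf[1] & 0x80)
    n = buf[1] & 0x7F
    pos = 2
    if n == 126:
        if len(buf) < pos + 2:
            return None
        (n,) = struct.unpack("!H", buf[pos:pos + 2])
        pos += 2
    elif n == 127:
        if len(buf) < pos + 8:
            return None
        (n,) = struct.unpack("!Q", buf[pos:pos + 8])
        pos += 8
    key = b""
    if masked:
        if len(buf) < pos + 4:
            return None
        key = buf[pos:pos + 4]
        pos += 4
    if len(buf) < pos + n:
        return None
    payload = buf[pos:pos + n]
    if masked:
        # Masked payloads are XORed with the 4-byte key.
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return payload, buf[pos + n:]
```

The length prefix is what gives the receiver message boundaries without having to parse the JSON payload.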

chfast commented 6 years ago

Both Unix Socket and Named Pipe servers recognize individual connections. Is that your question @axic?

I'm not sure what framing is about. Is it to reuse a single connection for multiple independent streams?

I don't like the Web Socket approach as the base transport for RPC, because the connection is available to every user of the machine and you need some additional authentication mechanism to be added. Am I right?

axic commented 6 years ago

Framing is about knowing where the message boundaries are. Current IPC relies on streaming JSON decoders to determine message boundaries. There are two long, heated threads about this though :)
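To illustrate what relying on a streaming decoder means in practice, here is a small Python sketch that splits a receive buffer into complete messages using the standard library's json.JSONDecoder.raw_decode. The function name is hypothetical. Note the weakness: the receiver cannot tell an incomplete message from a malformed one.

```python
import json

_decoder = json.JSONDecoder()


def split_messages(buffer: str):
    """Return (messages, remainder) for a buffer of concatenated JSON values."""
    messages = []
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace when given an index.
        while pos < len(buffer) and buffer[pos].isspace():
            pos += 1
        if pos == len(buffer):
            break
        try:
            message, pos = _decoder.raw_decode(buffer, pos)
        except json.JSONDecodeError:
            break  # incomplete (or invalid) tail: wait for more data
        messages.append(message)
    return messages, buffer[pos:]
```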

Websockets doesn't define any authentication or encryption, that is provided by HTTP. Though I only mentioned Websockets' framing, which is the actual message passing protocol after Websockets has been negotiated over HTTP (to avoid reinventing the wheel).

chfast commented 6 years ago

So the goal is to know where the JSON message ends without parsing the JSON? I've seen that libjson-rpc-cpp also has a TCP transport and it uses a special char to delimit the messages. See https://github.com/cinemast/libjson-rpc-cpp/blob/master/src/jsonrpccpp/server/connectors/tcpsocketserver.h#L19-L22.

By default "new line" \n is used to delimit messages, probably because "new line" (and other control characters) are not allowed in JSON strings directly.

chfast commented 6 years ago

Hm... I think I missed the fact that nicely formatted JSON contains new line chars.
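A newline delimiter therefore only works if every sender emits compact JSON. A quick Python sketch of that convention follows; the helper names are made up, and this is not taken from libjson-rpc-cpp.

```python
import json


def frame(message) -> bytes:
    """Serialize compactly and terminate with a single newline."""
    # Without indent, json.dumps emits no raw newlines; a newline inside
    # a string value is escaped as the two characters backslash and n.
    return json.dumps(message, separators=(",", ":")).encode("utf-8") + b"\n"


def unframe(buffer: bytes):
    """Return (messages, remainder); the remainder is an unfinished line."""
    *lines, rest = buffer.split(b"\n")
    return [json.loads(line) for line in lines if line], rest
```

A pretty-printing sender (for example json.dumps(message, indent=2)) would break this scheme, which is the problem noted above.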

chfast commented 6 years ago

Unix Socket Authentication

It is possible (but probably not in a portable way) to get the process and user id of a connection.

This information can be used to implement a blacklist: if we know that the HTTP proxy is process N, we can limit the privileges of connections from process N.

However, I'm not convinced by this solution in case the default access level is unrestricted. It would be too easy for users to spin up proxies on their own without restrictions applied.
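On Linux, the lookup mentioned above is the SO_PEERCRED socket option. A sketch in Python, assuming Linux and returning None elsewhere; the function names and the proxy_pid parameter are hypothetical:

```python
import socket
import struct


def peer_credentials(conn: socket.socket):
    """Return (pid, uid, gid) of the peer, or None where SO_PEERCRED is missing."""
    option = getattr(socket, "SO_PEERCRED", None)
    if option is None:
        return None  # not Linux: other platforms use different mechanisms
    fmt = "iII"  # struct ucred: pid_t pid, uid_t uid, gid_t gid
    raw = conn.getsockopt(socket.SOL_SOCKET, option, struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


def is_restricted(conn: socket.socket, proxy_pid: int) -> bool:
    """Blacklist rule: restrict connections that come from the known proxy process."""
    creds = peer_credentials(conn)
    return creds is not None and creds[0] == proxy_pid
```

As noted above, this only restricts the one proxy the node knows about; any other process run by the same user keeps the default access level.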

chfast commented 6 years ago

@karalabe if you have some free time, can you point us to the packages that are used for JSON RPC in geth?

chfast commented 6 years ago

I created this PoC in Python: HTTP to Unix Socket proxy: https://github.com/chfast/json-rpc-proxy.

karalabe commented 6 years ago

All our rpc boilerplate is in the 'rpc' package/folder in our repo root.

trungtt198x commented 6 years ago

The scripts/jsonrpcproxy.py is nice for enabling HTTP-based RPC interaction with the eth node. However, it would have been nicer to keep options such as "--json-rpc" and "--json-rpc-port" in the source code so that we would have more choices!

One could have introduced a compiler flag for turning the json-rpc option on or off. For MiniUPnP, there is such a nice compiler flag! Why not the same for json-rpc?

stvenyin commented 6 years ago

websocket http proxy server is JSON decoding unix tcp socket many connection

stvenyin commented 6 years ago

have not a request