paritytech / json-rpc-interface-spec


Revert "transactionWatch: Limit the number of active subscriptions and clarify error codes" #150

Closed: tomaka closed this 3 weeks ago

tomaka commented 6 months ago

Reverts paritytech/json-rpc-interface-spec#146

To give more details on my comment:

Let's say that a JSON-RPC client repeatedly connects to a JSON-RPC server, submits a transaction, then disconnects, thousands of times.

If the client uses transaction_broadcast, the logic of the API says that the transactions being broadcasted stop being broadcasted when the JSON-RPC client disconnects. Consequently there's no problem for the server, as it can simply throw away the transactions when the client disconnects. As explained in the DoS attack chapters, the JSON-RPC server is supposed to have a limit on the number of simultaneously-connected JSON-RPC clients. Since each client is only guaranteed 4 transactions, it means that there's a global maximum of 4 * n transactions being broadcasted by the server at any given time (where n is the maximum number of clients).
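To illustrate, here is a rough Rust sketch of the bookkeeping this implies on the server side. It is not part of the spec; the names and the exact limit are made up, and the connection limit is assumed to be enforced elsewhere. The point is only that, because each client's broadcasts are thrown away on disconnect, the total can never exceed the per-client guarantee times the number of connected clients.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical limit: the spec only guarantees that at least 4 transactions
// per client are accepted; a server may accept more if it wants to.
const MIN_GUARANTEED_PER_CLIENT: usize = 4;

type ClientId = u64;
type TxHash = [u8; 32];

#[derive(Default)]
struct BroadcastPool {
    /// Transactions currently being broadcast, keyed by the client that asked for them.
    per_client: HashMap<ClientId, HashSet<TxHash>>,
}

impl BroadcastPool {
    /// Handles a `transaction_broadcast` call, refusing it once the client has
    /// used up its guaranteed slots.
    fn broadcast(&mut self, client: ClientId, tx: TxHash) -> Result<(), &'static str> {
        let txs = self.per_client.entry(client).or_default();
        if txs.len() >= MIN_GUARANTEED_PER_CLIENT {
            return Err("per-client limit reached");
        }
        txs.insert(tx);
        Ok(())
    }

    /// On disconnect, everything the client was broadcasting can be dropped,
    /// which is what bounds the global size at
    /// `MIN_GUARANTEED_PER_CLIENT * max_number_of_clients`.
    fn on_disconnect(&mut self, client: ClientId) {
        self.per_client.remove(&client);
    }
}
```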

Note that the JSON-RPC server might actually continue to broadcast them if it so desires, because once a transaction has been broadcasted, there's nothing that the JSON-RPC client can do that can guarantee that the transaction will not be included. In other words, after the broadcasting has started, there's no big red button that the client can press in order to stop it. What calling transaction_stop or disconnecting does is potentially free up resources on the JSON-RPC server, which is what we care about here.

However, if the client uses transactionWatch_submitAndWatch, then the server is supposed to keep the transactions and continue broadcasting them even after the client disconnects. Once you've got the JSON-RPC response to transactionWatch_submitAndWatch, and assuming that you believe that your transaction is valid, you can disconnect and have every reason to believe that the transaction will make its way on chain. (This has been discussed before: if you generate a transaction using a tool such as PAPI, there's no reason to believe that the transaction would be invalid, unless you generate the same transaction from multiple machines at the same time, or unless you think that there's a bug in your tool. But you generally shouldn't write code that tries to guard against bugs, because that code might have bugs as well, and if we go in that direction then there might also be a bug in the JSON-RPC server's code.)


Once a transaction enters the pool, it goes through three phases:

- it gets validated;
- it gets gossiped to the rest of the network;
- it gets included in a block (or eventually discarded).

In order to prevent malicious clients from making the server use an unreasonable amount of CPU or memory, you want to have a global (i.e. shared between all clients) maximum number of transactions being concurrently validated, and a global maximum number of transactions concurrently being gossiped. In our example of JSON-RPC clients that repeatedly connect/submit/disconnect, at some point these two lists are going to fill up, which is problematic.
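As a sketch of what "global" means here (again my own illustrative Rust, with arbitrary numbers), the two lists are shared between all clients rather than being per-client:

```rust
use std::collections::HashSet;

// Arbitrary, illustrative limits shared by *all* JSON-RPC clients.
const MAX_VALIDATING: usize = 256;
const MAX_GOSSIPING: usize = 8192;

type TxHash = [u8; 32];

#[derive(Default)]
struct SharedTxState {
    /// Transactions currently being validated, across all clients.
    validating: HashSet<TxHash>,
    /// Transactions currently being gossiped, across all clients.
    gossiping: HashSet<TxHash>,
}

impl SharedTxState {
    /// Returns `false` once the server-wide validation budget is exhausted.
    fn try_start_validating(&mut self, tx: TxHash) -> bool {
        self.validating.len() < MAX_VALIDATING && self.validating.insert(tx)
    }

    /// Returns `false` once the server-wide gossip budget is exhausted.
    fn try_start_gossiping(&mut self, tx: TxHash) -> bool {
        self.gossiping.len() < MAX_GOSSIPING && self.gossiping.insert(tx)
    }
}
```

An attacker that repeatedly connects, submits, and disconnects is trying to fill these two sets with entries that no connected client owns.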

How this can be handled depends on the situation: either the malicious client submits invalid transactions, or the malicious client submits valid transactions.

If the transactions are invalid, then a full node JSON-RPC server can solve this problem by validating the transactions, and discarding the invalid ones, before sending back the response to transactionWatch_submitAndWatch. In other words, the list of transactions being validated only contains transactions from connected clients. This guarantees that the size of this list is bounded, similarly to transaction_broadcast as explained above.

If the transactions are valid, then the malicious actor will fill up the list of transactions being gossiped. However, since the transactions are valid, the malicious actor will pay some fees, and thus the attack becomes costly for the attacker (very costly if they submit a ton of transactions). Since the list is full, the server has no choice but to prevent new transactions (including transactions from legitimate clients) from being submitted, which means that the DoS attack is succeeding (it successfully "denies service" to the legitimate clients), but because the attack is costly, this is acceptable. Importantly, the server must not discard transactions from disconnected clients, as that would remove the cost of the attack for the attacker.
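Putting the two cases together, a full-node handler for transactionWatch_submitAndWatch could look roughly like the sketch below. This is my own illustrative Rust, not anything from the spec; `validate`, `start_gossiping`, and the outcome names are hypothetical stand-ins.

```rust
enum SubmitOutcome {
    /// Valid and now being gossiped; the server keeps gossiping it even if
    /// the client disconnects.
    Accepted,
    /// Found invalid during the pre-response validation; nothing is kept
    /// around once the error is sent back.
    Invalid,
    /// The global gossip list is full; new submissions are refused rather
    /// than evicting already-validated transactions from disconnected
    /// clients, since evicting them would make the attack free again.
    GossipListFull,
}

fn submit_and_watch(tx: &[u8], gossip_list_is_full: bool) -> SubmitOutcome {
    // Validate *before* answering the JSON-RPC call, so that the validation
    // list only ever contains transactions from currently-connected clients.
    if !validate(tx) {
        return SubmitOutcome::Invalid;
    }
    if gossip_list_is_full {
        return SubmitOutcome::GossipListFull;
    }
    start_gossiping(tx);
    SubmitOutcome::Accepted
}

// Hypothetical stand-ins so the sketch is self-contained.
fn validate(_tx: &[u8]) -> bool { true }
fn start_gossiping(_tx: &[u8]) {}
```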

This was for full nodes. On a light client, validating a transaction takes a long time, and thus this DoS attack would work. However, light clients are supposed to be local-only, and thus aren't supposed to have malicious JSON-RPC clients that connect and disconnect. They are out of the equation here.


So to go back to this PR.

Without #146, the server can implement the DoS-protection measures that I've described in the previous paragraphs, which are:

- validate transactions before sending back the response to transactionWatch_submitAndWatch, and discard the invalid ones immediately, so that the list of transactions being validated only contains transactions from connected clients;
- when the list of transactions being gossiped is full, refuse new submissions rather than dropping already-validated transactions from disconnected clients.

With #146, however, the server would have no choice but to drop validated transactions from disconnected clients in order to make room for the 4 guaranteed transactions per client. This makes it impossible to properly prevent this specific DoS attack.


You might argue that we should instead modify transactionWatch_submitAndWatch so that the transaction stops being broadcasted when the client disconnects.

To me, it's convenient to be able to connect, submit, and disconnect. It means that you can write, for example, a bash script that submits a transaction with curl, without having to write complicated code that parses the notifications and waits for the included or finalized event.

transactionWatch_submitAndWatch is about being convenient after all, while transaction_broadcast is for doing it properly.

However, the fact that it's so complicated to get the implementation right makes it very inelegant. If there's a consensus to drop transactions when the client disconnects, I'm not against doing so.

jsdw commented 3 weeks ago

I guess we can close this since it was merged in https://github.com/paritytech/json-rpc-interface-spec/pull/153, but re-open if need be!