Consider making downward adjustments to RPC values in .env

unsystemizer commented 6 years ago

I did some testing and found that I get slower but still acceptable performance with very minimalist settings: just 1 concurrent RPC and only 32 RPC requests per batch (compared to 32 and 500, respectively, in https://github.com/bitcoinjs/indexd/blob/v0.8.2/example/.env#L5).

The tests in the table were done as follows: a) start bitcoind on testnet and let it catch up with the network; b) delete the indexd data, start indexd, and stop it once it catches up. I used bitcoin 0.15.1rc1 and indexd 0.8.2. This system has plenty of resources and uses an SSD.

In prior days (not on the chart), using the default indexd .env settings with bitcoind adjusted to 32 RPC threads and a queue of 500, I'd get the (address) index built in as little as 70 minutes, but with a lot more dropped messages and (IMO) unnecessarily high stress on bitcoind.

With my minimalist settings I also got the lowest number of missed ZMQ messages. Your mileage may vary (I had no application workload from indexd clients), but IMO indexd is an important but not critical service, and my lowest settings work just fine without placing bitcoind at risk (we know that aggressive settings can kill bitcoind).

I would suggest changing the defaults to values that aren't higher than the related bitcoind defaults, so that bitcoind's settings do not have to be changed (so, no more than 4 RPC threads on bitcoind). I'm not sure how batch sizes are sized, but if bitcoind's default rpcworkqueue is 16, maybe RPCBATCHSIZE should be 16 as well. Maybe even 8 or 4 would work well: because the initial values were so high, it took me a long time to work down to the low values in the chart, so I haven't had a chance to try even lower ones.
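To make the suggestion concrete, here is a rough Python sketch that caps proposed indexd values at bitcoind's defaults. The helper names `capped_env` and `render_env` are hypothetical, and pairing RPCBATCHSIZE with rpcworkqueue is only the guess made above, not something confirmed from the indexd code.

```python
# bitcoind defaults for rpcthreads and rpcworkqueue, as cited in this comment.
BITCOIND_RPCTHREADS = 4
BITCOIND_RPCWORKQUEUE = 16


def capped_env(concurrent, batch_size,
               rpcthreads=BITCOIND_RPCTHREADS,
               rpcworkqueue=BITCOIND_RPCWORKQUEUE):
    """Clamp proposed indexd values to bitcoind's limits (never below 1).

    Assumption: RPCCONCURRENT is bounded by rpcthreads and RPCBATCHSIZE
    by rpcworkqueue. The second pairing is a guess, not a verified rule.
    """
    return {
        "RPCCONCURRENT": max(1, min(concurrent, rpcthreads)),
        "RPCBATCHSIZE": max(1, min(batch_size, rpcworkqueue)),
    }


def render_env(values):
    """Render the values as KEY=VALUE lines suitable for a .env file."""
    return "\n".join(f"{key}={value}" for key, value in values.items())


if __name__ == "__main__":
    # Start from the 32 / 500 values in the v0.8.2 example .env.
    print(render_env(capped_env(32, 500)))
```

With the 32 and 500 values as input, this yields RPCCONCURRENT=4 and RPCBATCHSIZE=16, which a stock bitcoind should be able to serve without any configuration changes.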

dcousens commented 6 years ago

@unsystemizer understandable, but synchronization is the least useful process to a user. I use indexd to serve thousands of requests in short time frames; that is why the RPC values are higher by default.

If we can add https://github.com/bitcoinjs/indexd/issues/27, then the higher RPC settings will also translate directly into faster synchronization.

unsystemizer commented 6 years ago

Sounds good. I expected that "incoming" RPCs have (or can have) separate maximums (another "pool") from "back-end" RPCs. If it's all one big pool of RPC resources, it's easy to see why one might need to drive a lot of them.

If you have a real-life client RPC log that can be fed to curl to simulate a production environment, let me know and maybe I can test that too (perhaps not immediately, but later, as I'd need to run it on mainnet). Earlier I was wondering about logging because I thought we could capture RPC parameters and use a simple tool (worst case, a bit of bash) to generate a "replay" script for regression testing and tuning. Admittedly, if many requests are fired nearly concurrently, it wouldn't be easy to drive that log from one client (or to distribute the workload properly across more than one client).
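As a rough illustration of the replay idea, a minimal Python sketch might look like the following. Everything about the log is an assumption: it imagines a hypothetical file with one JSON request body per line, POSTed to a single URL, which may not match how real indexd clients talk to the service. The function names are hypothetical too.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib import request


def load_requests(lines):
    """Parse one JSON request body per non-empty line (hypothetical log format)."""
    return [json.loads(line) for line in lines if line.strip()]


def post_json(url, body, timeout=30):
    """POST one JSON body and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    req = request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def replay(bodies, send, workers=4):
    """Pass every body to `send` using `workers` threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, bodies))


# Example usage (URL and file name are placeholders):
#   with open("client-rpc.log") as fh:
#       bodies = load_requests(fh)
#   statuses = replay(bodies, lambda b: post_json("http://localhost:8080/", b))
```

Raising `workers` approximates requests fired nearly concurrently from one client, though a single machine would still not reproduce the original timing or a workload spread across several clients.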

dcousens commented 6 years ago

Changed to RPCCONCURRENT=16 in example/.env
