Open tomih opened 11 months ago
Are you requesting them with full transaction details or just hashes? Do you have the redis cache enabled?
I'm requesting them with full transactions and don't use redis cache. I just tested requests via ws (eth_getBlockByHash) with full transactions and it is fast as expected.
Did you make the WS request through Dshackle as well? If yes, can you test it again restarting Dshackle before the test, to make sure it doesn't use any internal caching?
Yup, ws requests go through Dshackle. The current test is: I subscribed to newHeads and on update, immediately request eth_getBlockByHash with full transactions. Even If I restart Dschackle, results are very close to direct connection to node. Here are my results:
+--------------------------------------+---------------------+--------------------+---------------------+---------------------+---------------------+-----------+
| Block Number | lb-local | lb-gcloud | node-1 | node-2 | infura | lb-node-1 |
+--------------------------------------+---------------------+--------------------+---------------------+---------------------+---------------------+-----------+
| New Head for block 17813705 | 0.0558011531829834 | 3.7076148986816406 | 0.29482221603393555 | 0.06288886070251465 | 2.518273115158081 | 0.0 |
| Get Block By Hash for block 17813705 | 0.19244098663330078 | 1.8641810417175293 | 0.09901094436645508 | 0.1606767177581787 | 0.12928104400634766 | 0.0 |
+--------------------------------------+---------------------+--------------------+---------------------+---------------------+---------------------+-----------+
lb-local: Dschackle on local machine lb-gcloud: Dschakle on gcloud lb-node-1: Dschackle on same server as node-1 The results are diff in ms from fastest response.
edit:lb-local is the one I restarted before the test.
hm, so, just to confirm your findings. When you requests eth_getBlockByHash(number, true)
though HTTP it's slow, but when you do the same request through WS it shows just a small overhead, right? Do you request a recent block in the both cases?
I suspect that the slow response may happen for an old block, and a request for a recent block may work better. Maybe that the reason of the difference.
PS I see why Dshackle can be slow for this particular request, but I don't understand why WS and HTTP work differently. That's why I'm asking this to better understand where to look at.
Everything you stated is correct. getBlock is slow over http and almost no overhead over ws. And it's also true, that ws test is requesting only the latest block where http test is fetching last 50 blocks.
I will create identical tests for ws and http and let you know. thnx for fast response.
You were right. The ws test was fast, because it fetched cached / latest block. Here are the results, http vs ws:
Block Number HTTP Benchmark (ms) WS Benchmark (ms)
========================================================================================================================
17818823 1707.369089126587 1827.8157711029053
17818822 2408.0188274383545 2526.607036590576
17818821 189.23497200012207 323.7130641937256
17818820 1394.942045211792 1447.16215133667
17818819 677.4783134460449 1110.3830337524414
17818818 260.17189025878906 120.99218368530273
17818817 4065.1087760925293 3823.888063430786
17818816 806.1330318450928 1268.7101364135742
17818815 2133.568048477173 2440.6509399414062
17818814 134.89723205566406 271.1009979248047
You mentioned that you know why get block with txs is slow. Can you elaborate? Are blocks with full tx not cached? Is there a way to overcome that?
I also tried with redis and results are the same.
I had some wrong assumptions for caching this type of requests. Dshackle processes the block and transactions inside it independently. It works fine for the recent blocks, because most of the data in the memory. Then this data gets removed from this "hot data" cache, but the secondary cache doesn't have it because it's unpopular data, so it wastes a lot of time checking each transaction independently. I'll need to rewrite this part to make it fast as it supposed to be.
Is there any ETA for the fix? Or maybe some pointers how to properly fix the problem? I also noticed very high CPU load even when dshackle is idle (no incoming request). I traced it down to scheduler and assume there is a culprit. thnx, T.
I've fixed it in master
.
Do you think you can try with that version? It's not uploaded to docker, but you could build it locally (join our Discord, if you have questions)
btw, how do you measure CPU usage and what is the high load in your case?
Hi. Sorry for late feedback (i was on holidays). I applied the fix but results seem the same:
get full block:
Direct node connection:
1. Current block: 17962783, block_size: ' 273033' bytes, block_time: ' 472.039' ms
2. Current block: 17962782, block_size: ' 41464' bytes, block_time: ' 131.553' ms
3. Current block: 17962781, block_size: ' 62969' bytes, block_time: ' 134.895' ms
4. Current block: 17962780, block_size: ' 157421' bytes, block_time: ' 174.418' ms
5. Current block: 17962779, block_size: ' 217300' bytes, block_time: ' 290.471' ms
Dshackle:
1. Current block: 17962783, block_size: ' 273033' bytes, block_time: '3765.081' ms
2. Current block: 17962782, block_size: ' 41464' bytes, block_time: ' 246.481' ms
3. Current block: 17962781, block_size: ' 62969' bytes, block_time: ' 257.941' ms
4. Current block: 17962780, block_size: ' 157421' bytes, block_time: '1866.621' ms
5. Current block: 17962779, block_size: ' 217300' bytes, block_time: '2581.968' ms
For CPU, I'm just looking @ top (on 16 cpu machine / no requests - idle ):
Having the same issue, specifically for eth_getBlockByNumber" params":["latest",true].
I'm using @MysticRyuujin's dappnode package. @tomih What are you using to benchmark, I'll try it against my instance.
I wrote simple python script: https://gist.github.com/tomih/177c40722b06a4de94aff9fa27376b70 Just change / add nodes to nodes variable. Let me know the results.
I benchmarked dshackle and found that get_block (by number or by hash) and with included transactions is terribly slow.
Here are the average results for last 50 blocks:
Block average response time: dshackle on node: 1526.1953592300415 ms dshackle on local machine: 1431.77649974823 ms node: 135.94377040863037 ms
Other calls, like block.latest, get_transaction_receipt or getBalance return expected results (ie. add few ms latency). Can somebody explain why getBlock is so slow and how to tackle the problem?
edit: get_block call without transactions is fast as expected.
Thnx, Tomi