ethereum / web3.py

A python interface for interacting with the Ethereum blockchain and ecosystem.
http://web3py.readthedocs.io
MIT License
4.95k stars 1.69k forks

causes of sudden changes in consumption time #2388

Open suxnju opened 2 years ago

suxnju commented 2 years ago

output

100%|██████████| 500/500 [04:41<00:00,  1.77it/s]
total time cost 281.79

image

What was wrong?

    import time

    from tqdm import tqdm
    from web3 import Web3

    WSADDRESS = 'ws://xxx.xx.xxx.xx:8545'
    w3 = Web3(Web3.WebsocketProvider(WSADDRESS, websocket_timeout=60))

    address = "0xa8f9c7ff9f605f401bde6659fd18d9a0d0a802c5"
    # sample first 500 transactions (sampling_transactions is my own helper, not shown here)
    sample_txs = sampling_transactions(address, use_sampling=True, sample_count=500)

    # caller_tracer.js is from https://github.com/ethereum/go-ethereum/blob/master/eth/tracers/js/internal/tracers/call_tracer_legacy.js
    with open('./caller_tracer.js', 'r') as f:
        callTracer = f.read()

    def trace_transaction(tx):
        # send the raw debug_traceTransaction request with the custom JS tracer
        return w3.provider.make_request('debug_traceTransaction', [
            tx,
            {'tracer': callTracer, 'timeout': '60000s'}
        ])

    time_used = []
    ts = time.time()
    for tx in tqdm(sample_txs):
        s = time.time()
        trace_transaction(tx['transaction_hash'])
        e = time.time()
        time_used.append(e - s)
    te = time.time()
    print("total time cost %.2f" % (te - ts))

    # draw transaction index vs time_used

    # draw transaction index vs accumulated time
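The per-request timings collected in `time_used` can also be summarized numerically, which makes it easier to see where the sudden change occurs. The following is a minimal sketch and not part of the original report: `find_slow_calls`, `cumulative_time`, the `factor` threshold, and the sample timings are hypothetical names and values chosen for illustration.

```python
from itertools import accumulate
from statistics import median


def find_slow_calls(time_used, factor=5.0):
    """Return the indices whose duration exceeds `factor` times the median.

    `factor` is an arbitrary illustrative threshold, not a recommended value.
    """
    if not time_used:
        return []
    baseline = median(time_used)
    return [i for i, t in enumerate(time_used) if t > factor * baseline]


def cumulative_time(time_used):
    """Return the running total of the durations (the accumulated-time curve)."""
    return list(accumulate(time_used))


# Made-up timings in seconds, only to show the shape of the result.
sample = [0.1, 0.1, 0.12, 0.1, 3.0, 2.8, 0.11]
print(find_slow_calls(sample))  # [4, 5]
```

Running this on the real `time_used` list after each experiment would show whether the slow requests cluster at one transaction index or move between runs.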



* What type of node you were connecting to.
A node on a different computer (full sync), so I have to use Websockets.

**I want to know the reason and how to fix it, thanks a lot**

wiseaidev commented 2 years ago

My first guess is that this happens because a block holds roughly 400 transactions, and mining each block takes time (approximately 10 to 14 seconds). Presumably, as you may know, to speed things up you would need to raise the gas fees slightly or increase the block size (the main bottleneck parameter in a private network). I am not an expert in this field, but that seems to be the reason. Besides, I think it would be worth extending your experiment to something like 16k transactions to get more insight.

suxnju commented 2 years ago

Thank you for your reply!

But in fact, the transactions here are the first 500 transactions sent to the same contract (0xa8f9c7ff9f605f401bde6659fd18d9a0d0a802c5). They are not all in the same block; rather, only a few of them share a block (https://etherscan.io/txs?a=0xa8f9c7ff9f605f401bde6659fd18d9a0d0a802c5&p=40).

Also, the break point is not fixed. I suspect the problem lies in the connection to the node, but I don't know how to fix it.

The node I connect to was started with the following command to enable WebSocket access:

    geth --ws --ws.origins "*" --ws.addr xxx --ws.port "8545" --datadir xxx --maxpeers 0
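One way to test the suspicion about the connection is to time a cheap request next to the trace request. If the cheap request also slows down around the break point, the connection or the node as a whole is the more likely cause; if only the trace slows down, the tracing work itself is. The sketch below is an assumption-laden illustration: `time_call` and `summarize` are hypothetical helpers, and the commented usage assumes the `w3` object and `trace_transaction` function from the original report.

```python
import time
from statistics import mean


def time_call(fn, repeats=5):
    """Call fn() `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the minimum, mean, and maximum of a non-empty list of durations."""
    return {"min": min(durations), "mean": mean(durations), "max": max(durations)}


# Hypothetical usage against a live node (not executed here):
# cheap = time_call(lambda: w3.provider.make_request('eth_blockNumber', []))
# heavy = time_call(lambda: trace_transaction(tx['transaction_hash']))
# print(summarize(cheap), summarize(heavy))
```

`eth_blockNumber` is used as the cheap request because it needs almost no work on the node, so its latency is mostly the round trip over the WebSocket.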

wiseaidev commented 2 years ago

> A node on a different computer (full sync), so I have to use Websockets.

Because it is full sync, the problem seems to be related to your current hardware setup, such as the RAM, CPU, and SSD used for running that node. As far as I know, it is hard and painful to fully sync with the main chain nowadays because of how large the network is. For drives, I think the most advisable option for running a full node is an SSD (> 500 GB). If you are using an HDD, however, the node will not be able to keep up with the latest transactions because it is I/O bound, and it will take a lot of time to read and write block state to disk. These days, a light node seems to be the best choice if you have an HDD. As for RAM, you need to check how much the node consumes (e.g., I have seen a node consume a solid 16 GB of RAM). If you are limited to your current amount, try increasing the swap space in case your machine runs out of RAM, although I guess it would not make any difference if you are using an HDD.

I hope the above info will help you in one way or another.
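The suggestion to check how much RAM and swap the node machine is using can be sketched as follows. This is only an illustration under a loud assumption: it presumes the node runs on Linux, where `/proc/meminfo` is available. `parse_meminfo`, `memory_pressure`, and the sample numbers are hypothetical and not from the thread.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: integer value}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_pressure(info):
    """Return the fractions of RAM and swap currently in use (0.0 to 1.0)."""
    ram_used = 1 - info["MemAvailable"] / info["MemTotal"]
    swap_total = info.get("SwapTotal", 0)
    swap_used = 1 - info["SwapFree"] / swap_total if swap_total else 0.0
    return ram_used, swap_used


# Made-up sample in the /proc/meminfo format (values in kB).
sample_text = """MemTotal:       16000000 kB
MemAvailable:    4000000 kB
SwapTotal:       2000000 kB
SwapFree:        1000000 kB
"""
print(memory_pressure(parse_meminfo(sample_text)))

# Hypothetical usage on the node machine (Linux only):
# with open('/proc/meminfo') as f:
#     print(memory_pressure(parse_meminfo(f.read())))
```

Sampling these two fractions while the trace loop runs would show whether the slow requests coincide with the machine running low on RAM or starting to swap.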

suxnju commented 2 years ago

Thank you again for your reply.

In fact, I chose full sync so that I can use trace information (via debug_traceTransaction). I followed the official suggestion (https://geth.ethereum.org/docs/faq) and use an SSD (≈8 TB), which now has more than 500 GB left (about 12 million blocks).

But all the data is stored on a laptop, so its performance (e.g., CPU, RAM) may limit the node's efficiency. I will run some experiments and report back later. Thanks again!