kadena-io / chainweb-node

Chainweb: A Proof-of-Work Parallel-Chain Architecture for Massive Throughput
https://docs.kadena.io/basics/whitepapers/overview
BSD 3-Clause "New" or "Revised" License

Chainweb Transaction Times #1895

Open masch1na opened 3 months ago

masch1na commented 3 months ago

Dear team,

We already know that mempool transaction propagation is not optimal, but when I brought it to your attention, the problem was basically dismissed as "not a problem" - https://github.com/kadena-io/chainweb-node/issues/1622

However, we are at a point where users and devs are not happy with the overall transaction times. It's not getting better, and we are losing devs and users as a result. I am trying to see if there's anyone within the Kadena team who can agree with me that waiting almost 3 minutes for a transaction to be mined on a PoW network with ~30s block time is not optimal, not okay, and shouldn't be dismissed as "working as intended".

I ran a test, sending 50 transactions to a single node and 50 transactions to multiple nodes, to see how long it would take for each of those transactions to be mined and finalized. Full testing results can be found here - https://pastebin.com/BRqrjbUf

TLDR:

Single node: minimum time 32s, maximum time 2m44s, average time over 50 attempts 1m30s

Multiple nodes: minimum time 24s, maximum time 2m12s, average time over 50 attempts 57s

The pastebin above clearly shows that sending to multiple nodes does lower the time for a tx to be mined, by roughly 37% on average (1m30s down to 57s).
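As a quick sanity check on the averages quoted in the TLDR, the relative reduction can be computed directly. This is only a minimal sketch: the helper name `to_seconds` is made up for illustration, and the only inputs are the two averages stated above (1m30s and 57s).

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair into total seconds."""
    return minutes * 60 + seconds

# Averages over 50 attempts, as quoted in the TLDR above.
single_avg = to_seconds(1, 30)  # single node: 1m30s
multi_avg = to_seconds(0, 57)   # multiple nodes: 57s

# Relative reduction in average time when sending to multiple nodes.
reduction = (single_avg - multi_avg) / single_avg

print(f"single node avg:    {single_avg}s")
print(f"multiple nodes avg: {multi_avg}s")
print(f"reduction:          {reduction:.1%}")  # 36.7%
```

Measured against the ~30s block time, the multiple-node average of 57s is still close to two block intervals.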

When sending a transaction to multiple nodes, other nodes (to which I didn't send the tx) seem to be able to pick up that tx quite fast (within a couple of seconds), since there are multiple nodes trying to send that update to them. Despite the fact that almost all, if not all, nodes on the network have my tx in the mempool, in some instances it still took up to 2 minutes for the transaction to finally be mined.

Mempool propagation isn't the best: when sending to a single node, it takes a long time (~60s) for any other node to finally see the tx. But even when bypassing this issue by sending to multiple nodes, the time it took for a transaction to be mined wasn't satisfactory.

I am trying to see if anyone from the Kadena team can agree with the above findings and work on figuring out how we can bring down transaction times. No user likes waiting 3 minutes for a transaction, regardless of how low the gas fee is.

There was an idea going around that it's the mining nodes which are delayed, which is why I listed miners in the pastebin above. It seems that the miners who mined some of the fastest txs also mined some of the slowest txs, so I don't think there's any specific mining pool slowing the network down.

Imagine all nodes on the network have the transaction in the mempool within a few seconds, yet it still takes 2 minutes for it to be mined. My only idea as to why is that all mining pool nodes are hidden behind a firewall or a proxy and are not reachable from the outside world. I think the delay could be there, but I really need someone from the Kadena team to think about this as well. If the delay really is there, maybe the default values for how often a node asks other nodes for updates can be tweaked.

Regardless, I would really love to see an improvement in transaction speed. If we could get all txs to ~30s, which would correspond with our block time, that would be really nice. Even though we have a ~30s block time, in the end it doesn't even matter, because transactions are just pending in the mempool: 2-3 blocks get mined empty and then the tx finally gets mined. That renders the ~30s block time useless, and that just doesn't sit well with me for the blockchain of the future.

Thank you for taking a look.

edmundnoble commented 3 months ago

Hi @masch1na. Let me just say that we hear you, and we want transactions to get into blocks as fast as possible. Seeing as we've already done a lot of managing expectations in the other ticket re: distributed networks and PoW, I'll explain some other issues that are slowing down our block time, issues for which we have a fix in the pipeline.

As Lars mentioned on the last ticket, the miner finishes mining whichever block it's mining before it consults the mempool again to include new transactions. That's a shame; it would seemingly be cheap for us to notify the miner that we've added new transactions to a block and ask it to start work on that new, refreshed version of the block. We actually have an allowance for that in the chainweb stratum protocol. Miners receiving a Notify message with cleanJob set to False will replace the block they're mining with a new one from the node.

There's another layer here though because of the web of chains. A chain is mineable if both a) it has a block ready and b) all of its adjacent chains are sufficiently caught up to it. We prepare a block on each chain in advance, even before that chain is mineable, to avoid forcing the miner to wait for the block to be prepared when it eventually asks for work. Because chains are expected to get a bit ahead of each other occasionally (though not too far, because of the braiding), this can result in a chain having a cached block which is quite old by the time that chain is mineable, leading to a bigger transaction delay than you might expect.
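The mineability condition described above can be sketched roughly as follows. This is an illustration under stated assumptions, not chainweb-node's actual logic: `ChainState`, `is_mineable`, and the three-chain toy graph are hypothetical, and "sufficiently caught up" is modeled here as every adjacent chain being at least as high as the chain in question.

```python
from dataclasses import dataclass

@dataclass
class ChainState:
    height: int             # height of the chain's current head block
    has_cached_block: bool  # whether a candidate block has been prepared

def is_mineable(chain_id, states, adjacency):
    """Assumed rule: a block is ready AND no adjacent chain is behind this one."""
    state = states[chain_id]
    if not state.has_cached_block:
        return False
    return all(states[n].height >= state.height for n in adjacency[chain_id])

# Toy graph of three mutually adjacent chains (the real chain graph is larger).
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
states = {
    0: ChainState(height=10, has_cached_block=True),
    1: ChainState(height=10, has_cached_block=True),
    2: ChainState(height=9, has_cached_block=True),
}

mineable = [c for c in sorted(states) if is_mineable(c, states, adjacency)]
print(mineable)  # [2]
```

In this toy state, chains 0 and 1 already hold prepared blocks but have to wait for chain 2 to catch up, which is the window in which a cached block ages.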

So there are two delays we can improve here: the time it takes a miner to finish mining a block, and the time between a prepared block being cached and the miner starting work on it. The first change we're working on refreshes cached blocks periodically, before the miner starts mining them. That's tracked by #1891. To do this, it includes some internal changes that make it possible to continue blocks that have been finished rather than re-execute them from scratch.
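To make the refresh idea concrete, here is a rough sketch of continuing a cached block with newly arrived transactions instead of rebuilding it. Everything in it is hypothetical (`CachedBlock`, `refresh`, the `capacity` limit, the string transaction ids); it does not reflect the actual change tracked by #1891.

```python
from dataclasses import dataclass, field

@dataclass
class CachedBlock:
    created_at: float                        # when this version was prepared
    txs: list = field(default_factory=list)  # transactions already included

def refresh(block, mempool, now, capacity=100):
    """Return a new cached block that keeps the existing transactions and
    appends pending ones, rather than re-executing everything from scratch."""
    included = set(block.txs)
    pending = [tx for tx in mempool if tx not in included]
    room = capacity - len(block.txs)
    return CachedBlock(created_at=now, txs=block.txs + pending[:room])

# A block cached at t=0 with one transaction; two more arrive before mining starts.
block = CachedBlock(created_at=0.0, txs=["tx-a"])
mempool = ["tx-a", "tx-b", "tx-c"]

refreshed = refresh(block, mempool, now=15.0)
print(refreshed.txs)  # ['tx-a', 'tx-b', 'tx-c']
```

Without the refresh, a miner picking up the cached block at t=15 would still be working on a block containing only `tx-a`.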

A follow-up change is to start notifying miners through the mining update stream when a block has been refreshed, which signals a miner to start work on the new block immediately. This may also require some work in chainweb-mining-client.

We also still have it on our agenda to tackle mempool improvements, as I mentioned before; it's just trickier.

masch1na commented 3 months ago

Hey @edmundnoble, thanks a lot for that reply. I am genuinely happy to hear that you were able to identify delays which are adding to the transaction time and that there is room for improvement. All of the issues you mentioned make sense to me. With all that, I can just say thank you and please don't stop working on this, because it truly makes a difference whether a transaction takes 50 seconds or 5 minutes (I did have 5-minute txs too, just not in the test I performed for this report, and I wanted to be transparent).

So keep it up, and thank you for the acknowledgement and the optimistic comment.

We will see what the transaction times look like after both improvements are implemented and go from there.