kadena-io / chainweb-node

Chainweb: A Proof-of-Work Parallel-Chain Architecture for Massive Throughput
https://docs.kadena.io/basics/whitepapers/overview
BSD 3-Clause "New" or "Revised" License
249 stars 92 forks source link

On-chain error reporting should be restored #1575

Closed CryptoPascal31 closed 1 year ago

CryptoPascal31 commented 1 year ago

Since this PR: https://github.com/kadena-io/chainweb-node/pull/1543 on-chain errors are not reported anymore.

I ignore the rational behind, maybe to reduce the block size ? But IMHO this is a big regression.

Using the /local l endpoint can't be an acceptable workaround, because conditions on chain could change between the /local call and the moment where is the transaction is being mined. In case of a /local success and on-chain failure, user must be able to troubleshoot.

Moreover, with some other people as well I give support on Discord. I often explain to people why their transaction failed, giving guidelines to try their transactions again. But now, this becomes impossible.

Post-replaying the failed transaction in /local for troubleshooting is not possible:

Please restore the feature, as soon as possible. It's a barrier to adoption. What should think a new user, when this transaction fails, having no way to understand why ?

masch1na commented 1 year ago

I 100% agree it should be restored for all of the above mentioned reasons. Imagine being a new user on Kadena and your transaction fails, but you have no idea why. Additionally, this basically doesn't even work as explained in my example scenario.

Example scenario: Imagine sending a transaction as a regular user on a DEX. Your transaction fails because of "insufficient output amount" due to not enough liquidity, but by the time you bring up your CLI and use /local there is enough liquidity so your transaction returns success. How using /local helpful in finding out what the original error was?

The transaction I would now send using /local would pass, because there is enough liquidity now, therefore it wouldn't return the original error and so I have no way to know why my original transaction failed, only left to guess, because there is no error message and the on-chain situation has changed.

tuncatunc commented 1 year ago

I’m new to Kadena and every other blockchain has error messages and even logs. Kadena should have it too.

masch1na commented 1 year ago

As predicted, it didn't take long. Someone tried to mint a NFT and his transaction failed for unknown reason and he asked "IDK wtf is happening". This change needs to be reverted asap

image

jwiegley commented 1 year ago

Allowing errors to appear on chain creates an unfortunate sort of dependency, since it makes any change to error messages a “forking change” — due to the fact that all chain data is hashed. There was also a mismatch in certain places between the gas cost of an operation, and the compute time required to generate error messages when that operation failed. For these reasons, errors have been reduced for the time being to a simple “Error” note on-chain. We will gradually be bringing back more concise and useful on-chain error messages.

CryptoPascal31 commented 1 year ago

IMHO At least, the error message emitted by a failed (enforce ..) should be returned. Call stack is less important.

Since (enforce ..) is now lazily evaluated, devs are free to build informative and explicit error messages.

masch1na commented 1 year ago

I'm not following the first reasoning about making changes to error messages - why would anyone with genuine intentions make changes to the error messages? Any interference with the credibility of a chainweb node resulting in different hash and therefore an unsynced node is imo a fair outcome.

I hear you on the gas mismatch though and I am confident you guys will figure it out.

At this moment the user experience is not good and people have no idea why their dex swaps / nft purchases didn't go through.

I am glad to hear that the plan is to gradually bring back more useful on-chain error messages. The blockchain needs them.

jwiegley commented 1 year ago

Hi @masch1na, the sorts of error message changes I mean can happen in Haskell library dependencies of the Pact library. These can change frequently as the library evolves, in ways that are entirely genuine on the author's part. Not everything we use was built for blockchain use, after all.

CryptoPascal31 commented 1 year ago

Thank you @jwiegley for your explanation. I understand your concern to not create a big fork by accident, because of two different library versions.

But I feel like (I maybe wrong.. ) that the type and message fields are not impacted. The first is only a word, and the second is generated by the Pact user code.

It may be possible to restore them pact the March's chainweb version without any threat ? No.

jwiegley commented 1 year ago

It's definitely on our roadmap to bring back error reporting to some degree, so that is something I'll add to our list of considerations as we think about what makes the most sense for users and for maintaining chain history.

emilypi commented 1 year ago

See #1585 - I'm working to improve the situation.

masch1na commented 1 year ago

Hey Emily, thanks for working on improving the situation so fast. This is definitely an useful workaround in the meantime. Being able to rewind tx to a specific block height is great, but still lacking, for example if there are 10 tx's in a block and mine was 5th, I can't simulate my tx behavior properly by specifying just the block height and not my tx's position in the block height. Also, what if during the time of the original tx I didn't have enough money on the chain because of a pending cross-chain transfer...now, while running my simulation I do have enough money so the tx will pass and I am left in a bad spot trying to figure out what happened.

I am aware John said that bringing back error messages is on a roadmap, so I'd just like to stress again that it would be really great if we could simply just see what the error message is on a failed tx instead of having to go through workarounds.

Thank you all for your understanding

emilypi commented 1 year ago

I understand your idea @masch1na - this could be fixed by simulating a block - i.e executing a batch of transactions in local :)

So far, /local has the (largely arbitrary) constraint of being single-tx simulation. If you wanted to simulate the completion of a xchain before executing your next thing, you could do it by structuring your own series of commands within the block. We obviously can't control for indeterminate behavior - who knows if someone else gets a command in between your txs that messes up the workflow, but if you could at least simulate an ideal series of txs without random actor interaction and get what you need. I think this would be doable given our current infrastructure, and probably scopeable for the next release if we agree to it internally.

I think being able to specify a particular position in the block might be a bit of an ask - this is truly random in the network. Blocks are assembled at random depending on the state of the mempool, so I question the utility of specifying a particular block index, but you would certainly be free in the above solution to assemble your blocks as desired.

masch1na commented 1 year ago

Hey Emily! :)

The ability to fully recreate the block and specify my tx position in it so that I can realistically see why the tx failed is a great workaround.

The question is how much of your time and resources do you want to use to build this workaround, as opposed to using these resources to bring back the actual error messages without having to go through workarounds.

I'd like to emphasize again the importance of having error messages included within the tx result.

For example, user was recently complaining that he couldn't add liquidity to Kaddex and posted his failed transaction - https://explorer.chainweb.com/mainnet/txdetail/zaF2KtzklWYhFZDIz3fG62gPfMk4arPy3MeGQnoOMkU - after seeing his tx, I know about the reasons of the failed tx as much as I did without seeing it - the tx report is useless.

There are many many more similar reports.

I am doubtful the general Kadena userbase will go through the hassle of simulating their transaction using /local just to see why the tx failed.

Thank you for all you do.

edmundnoble commented 1 year ago

I have an update here. As of the latest chainweb-node release, version 2.19, on-chain error reporting will return on mainnet at block height 3,774,423 and on testnet at block height 3,299,753, which are projected to occur shortly after June 1. Let us know what you think!

masch1na commented 1 year ago

Looks great so far. Thank you!

jwiegley commented 1 year ago

Closing on that report @masch1na. If you encounter any further problems, please feel free to reopen!