0chain / zboxcli

A client CLI using GoSDK to interface the blockchain, storage platform, and blobbers (storage providers)
Other
28 stars 23 forks source link

[FEATURE] Orignial error messages being replaced with generic message/ #53

Closed Sriep closed 2 years ago

Sriep commented 3 years ago

When zboxcli commands fail the reason for the error is often replaced with a generic Error creating allocation:transaction_not_found: Transaction was not found on any of the sharders.

This is unhelpful for the user and can result in unnecessary issues being raised https://github.com/0chain/zboxcli/issues/34.

Whatever is causing the 'real reason for an error' to be hidden should be fixed.

shaktals commented 3 years ago

My two cents on the related issue of semantic error codes.

Sriep commented 3 years ago

The overwriting issue seems to be happening in VerifyTransaction https://github.com/0chain/gosdk/blob/master/core/transaction/entity.go#L255

    return nil, common.NewError("transaction_not_found", "Transaction was not found on any of the sharders")
}

We want to return the error causing the transactions to fail, not this generic 'it did not work' error.

Sriep commented 3 years ago

For Rest calls error could be getting lost here in gosdk MakeSCRestAPICall https://github.com/0chain/gosdk/blob/master/zboxcore/zboxutil/http.go#L534

    if rate < consensusThresh {
        err = errors.New("consensus_failed", "consensus failed on sharders")
    }

Whatever gets returned by the REST API Call seems to get overwritten here with a consensus_failed error, obscuring the real problem. this is only for REST API calls, not smart-contract transactions. Should wrap any error here.

Also in 0chain for REST API this line needs checking. This is the last thing 0chian seems to do with the err from the endpoint. Maybe add a test case to handler_test.go that touches this line. The whole Response method smells odd.

shaktals commented 3 years ago

Just to reiterate, it would be very very helpful to review the issue of returning a 400 Bad Request code to API calls that have no data/resources to return. The logs get really cluttered and it makes harder to find real errors in the midst of it. A 200 with empty body or 204 would be much better.

Sriep commented 3 years ago

For transcatios the exact place the error gets lost needs testing but there are a couple of clear possablities. This does not seem to be enough to get the error dispalyed in the callers output.

GenerateBlock's local function txnPorcessor.

        if err := mc.UpdateState(ctx, b, txn); err != nil {
            if debugTxn {
                logging.Logger.Error("generate block (debug transaction) update state",
                    zap.String("txn", txn.Hash), zap.Int32("idx", idx),
                    zap.String("txn_object", datastore.ToJSON(txn).String()),
                    zap.Error(err))
            }
            failedStateCount++
            return false
        }

As you see the err returned by mc.UpdateState gets lost and replaced by a boolean value false. We chould pass this error up the stack until it can be wrapped into a new error. Also as ths is a local function, it might be as simple as setting the global GenerateBlock err object to the UpdateState error rather than creating an if block local varable.

Block.ComputeState

        if err := c.UpdateState(ctx, b, txn); err != nil {
            b.SetStateStatus(StateFailed)
            logging.Logger.Error("compute state - update state failed",
                zap.Int64("round", b.Round),
                zap.String("block", b.Hash),
                zap.String("client_state", util.ToHex(b.ClientStateHash)),
                zap.String("prev_block", b.PrevHash),
                zap.String("prev_client_state", util.ToHex(pb.ClientStateHash)),
                zap.Error(err))
            return common.NewError("state_update_error", "error updating state")
        }

Again the error returned by UpdateState is being lost. Now its replaced by a state_update_error. Clearly we should wrap the UpdateState error rather than just loosing it.

Fixing these seems a necessary first step, but it looks as if the error gets lost multiple times!

Dmdv commented 3 years ago

I found that happens often. @Sriep @guruhubb @NoSkillGuy

Symptoms:

  1. Transaction executes OK.
  2. State computing fails which means that the whole block is discarded and the transaction won't go to shaders

2021-08-15T22:13:52.366Z ERROR block/entity.go:870 compute state - state hash mismatch {"round": 23782, "block": "1569ab643d14187adf5ddc9ad406489b8a84855dc552ad3f33cacdfb73b41b07", "block_size": 4, "changes": 18, "block_state_hash": "5a5a33be5e7fad96808ac62550b5f95acb4e19cd346768374b5fcbe3b70ef2cf", "computed_state_hash": "d4124c346090ca0a6bd71718f2c2abd767872adeededb1ffb0b92821771947e9"}

Testing strategy

Add a function to any smart contract where:

  1. Get root node for all service providers from the state
  2. Add new service provider to root node
  3. Save root node back to state

Affects:

All smart contracts where MPT is being mutated by adding a new service provider with balances.InsertTrieNode function

image

image

cnlangzi commented 3 years ago

https://github.com/0chain/gosdk/blob/master/zboxcore/zboxutil/http.go#L534

other http has been updated for this issue, and getting better performance in parallel requests if it is possible. https://github.com/0chain/gosdk/blob/520f39c0fe70bd33a21c7eff1d269386ac653f86/core/transaction/entity.go#L214

guruhubb commented 3 years ago

@NoSkillGuy @bbist one way to handle this is to send the failed transactions to sharders and have them store them in a new Cassandra table, say failed transactions. If the transaction is not found in the blockchain then the sharder can look up the failed table and send the error response for that transaction. Thoughts?

cnlangzi commented 3 years ago

@NoSkillGuy @bbist one way to handle this is to send the failed transactions to sharders and have them store them in a new Cassandra table, say failed transactions. If the transaction is not found in the blockchain then the sharder can look up the failed table and send the error response for that transaction. Thoughts?

or get error message from miner? https://github.com/0chain/zboxcli/issues/81

guruhubb commented 3 years ago

According to @NoSkillGuy the PR 401 should cover it

bbist commented 3 years ago

According to @NoSkillGuy the PR 401 should cover it

Unfortunately, it doesn't cover blockchain scoped errors at the moment. I've created another PR to resolve this by including failed smartcontract transactions in a block, where only successful transactions were stored before.

shaktals commented 3 years ago

I have no knowledge of this being fixed already, please correct me. Otherwise, I reinstate my core issue with error codes/messages on 0chain.

A 400 Bad Request response is not semantic for a correct request, even if there is no data to return.

We should be returning either:

Should this be posted on the 0chain repo too?