Closed Sriep closed 2 years ago
The overwriting issue seems to be happening in VerifyTransaction https://github.com/0chain/gosdk/blob/master/core/transaction/entity.go#L255
return nil, common.NewError("transaction_not_found", "Transaction was not found on any of the sharders")
}
We want to return the error causing the transactions to fail, not this generic 'it did not work' error.
For Rest calls error could be getting lost here in gosdk MakeSCRestAPICall
https://github.com/0chain/gosdk/blob/master/zboxcore/zboxutil/http.go#L534
if rate < consensusThresh {
err = errors.New("consensus_failed", "consensus failed on sharders")
}
Whatever gets returned by the REST API Call seems to get overwritten here with a consensus_failed error, obscuring the real problem. this is only for REST API calls, not smart-contract transactions. Should wrap any error here.
Also in 0chain for REST API this line needs checking. This is the last thing 0chian seems to do with the err from the endpoint. Maybe add a test case to handler_test.go that touches this line. The whole Response
method smells odd.
Just to reiterate, it would be very very helpful to review the issue of returning a 400 Bad Request
code to API calls that have no data/resources to return. The logs get really cluttered and it makes harder to find real errors in the midst of it. A 200
with empty body or 204
would be much better.
For transcatios the exact place the error gets lost needs testing but there are a couple of clear possablities. This does not seem to be enough to get the error dispalyed in the callers output.
GenerateBlock's local function txnPorcessor.
if err := mc.UpdateState(ctx, b, txn); err != nil {
if debugTxn {
logging.Logger.Error("generate block (debug transaction) update state",
zap.String("txn", txn.Hash), zap.Int32("idx", idx),
zap.String("txn_object", datastore.ToJSON(txn).String()),
zap.Error(err))
}
failedStateCount++
return false
}
As you see the err
returned by mc.UpdateState
gets lost and replaced by a boolean value false
. We chould pass this error up the stack until it can be wrapped into a new error. Also as ths is a local function, it might be as simple as setting the global GenerateBlock err
object to the UpdateState
error rather than creating an if block
local varable.
if err := c.UpdateState(ctx, b, txn); err != nil {
b.SetStateStatus(StateFailed)
logging.Logger.Error("compute state - update state failed",
zap.Int64("round", b.Round),
zap.String("block", b.Hash),
zap.String("client_state", util.ToHex(b.ClientStateHash)),
zap.String("prev_block", b.PrevHash),
zap.String("prev_client_state", util.ToHex(pb.ClientStateHash)),
zap.Error(err))
return common.NewError("state_update_error", "error updating state")
}
Again the error returned by UpdateState
is being lost. Now its replaced by a state_update_error
. Clearly we should wrap the UpdateState
error rather than just loosing it.
Fixing these seems a necessary first step, but it looks as if the error gets lost multiple times!
I found that happens often. @Sriep @guruhubb @NoSkillGuy
Symptoms:
2021-08-15T22:13:52.366Z ERROR block/entity.go:870 compute state - state hash mismatch {"round": 23782, "block": "1569ab643d14187adf5ddc9ad406489b8a84855dc552ad3f33cacdfb73b41b07", "block_size": 4, "changes": 18, "block_state_hash": "5a5a33be5e7fad96808ac62550b5f95acb4e19cd346768374b5fcbe3b70ef2cf", "computed_state_hash": "d4124c346090ca0a6bd71718f2c2abd767872adeededb1ffb0b92821771947e9"}
Testing strategy
Add a function to any smart contract where:
Affects:
All smart contracts where MPT is being mutated by adding a new service provider with balances.InsertTrieNode
function
https://github.com/0chain/gosdk/blob/master/zboxcore/zboxutil/http.go#L534
other http has been updated for this issue, and getting better performance in parallel requests if it is possible. https://github.com/0chain/gosdk/blob/520f39c0fe70bd33a21c7eff1d269386ac653f86/core/transaction/entity.go#L214
@NoSkillGuy @bbist one way to handle this is to send the failed transactions to sharders and have them store them in a new Cassandra table, say failed transactions. If the transaction is not found in the blockchain then the sharder can look up the failed table and send the error response for that transaction. Thoughts?
@NoSkillGuy @bbist one way to handle this is to send the failed transactions to sharders and have them store them in a new Cassandra table, say failed transactions. If the transaction is not found in the blockchain then the sharder can look up the failed table and send the error response for that transaction. Thoughts?
or get error message from miner? https://github.com/0chain/zboxcli/issues/81
According to @NoSkillGuy the PR 401 should cover it
Unfortunately, it doesn't cover blockchain scoped errors at the moment. I've created another PR to resolve this by including failed smartcontract transactions in a block, where only successful transactions were stored before.
I have no knowledge of this being fixed already, please correct me. Otherwise, I reinstate my core issue with error codes/messages on 0chain.
A 400 Bad Request
response is not semantic for a correct request, even if there is no data to return.
We should be returning either:
204 No content
(my preference)200 Success
but with empty body / no data404 Not found
which works, but it points to a client error, so I don't think it's really correct. And it kind pollutes the logs, specially when you are trying to debug real errors.Should this be posted on the 0chain repo too?
When
zboxcli
commands fail the reason for the error is often replaced with a genericError creating allocation:transaction_not_found: Transaction was not found on any of the sharders
.This is unhelpful for the user and can result in unnecessary issues being raised https://github.com/0chain/zboxcli/issues/34.
Whatever is causing the 'real reason for an error' to be hidden should be fixed.