[Bug] [Combined] Add clarifying error messages for all VMErrors

movekevin commented 1 year ago

@wrwg added the ability to attach an error message to VMErrors, so we should go through and add error messages to all common errors. Community developers have repeatedly run into these errors and could not debug exactly why an error happens, for example: https://github.com/aptos-labs/aptos-core/issues/8078 https://github.com/aptos-labs/aptos-core/issues/6232

Here's a list of the most common errors that are almost impossible to debug without clear error messages (to be expanded more as more are reported):

[ ] #9999
[ ] CONSTRAINT_NOT_SATISFIED
[x] BACKWARD_INCOMPATIBLE_MODULE_UPDATE
[ ] #9998
[x] RESOURCE_DOES_NOT_EXIST (doesn't say which resource doesn't)
[x] RESOURCE_ALREADY_EXISTS (doesn't say which)
[x] ARITHMETIC_ERROR (doesn't say overflow, underflow or division by zero)
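
To illustrate the kind of detail being asked for in the ARITHMETIC_ERROR item, here is a minimal sketch in Python. It is not the VM's implementation: `DetailedArithmeticError` and the `checked_*` helpers are hypothetical names, and the only assumption taken from Move is that values are unsigned 64-bit integers.

```python
U64_MAX = 2**64 - 1  # upper bound of an unsigned 64-bit integer


class DetailedArithmeticError(Exception):
    """Hypothetical stand-in for ARITHMETIC_ERROR that carries a reason."""

    def __init__(self, reason: str):
        super().__init__(f"ARITHMETIC_ERROR: {reason}")
        self.reason = reason


def checked_add(a: int, b: int) -> int:
    # Report overflow explicitly instead of a bare status code.
    if a + b > U64_MAX:
        raise DetailedArithmeticError(f"overflow in {a} + {b}")
    return a + b


def checked_sub(a: int, b: int) -> int:
    # Unsigned subtraction underflows when the result would be negative.
    if b > a:
        raise DetailedArithmeticError(f"underflow in {a} - {b}")
    return a - b


def checked_div(a: int, b: int) -> int:
    if b == 0:
        raise DetailedArithmeticError(f"division by zero in {a} / {b}")
    return a // b
```

With a message like this, a caller can tell the three failure modes apart without inspecting bytecode offsets.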

brmataptos commented 1 year ago

Is it possible we can also add a feature (at least when running outside the blockchain) to specify details of errors? For example, LINKER_ERROR can say what function is missing. INCOMPATIBLE_TYPES can tell what the two types are. etc. Can we roll that into this bug or should I file a new one?

movekevin commented 1 year ago

Great idea! The error messages cannot contain a lot of details such as stack traces. Let’s create a separate ticket for that though, as I’m also not sure in which layer the extra details should be generated: VM, API, or CLI.

brmataptos commented 1 year ago

Here's an example error that might be made into a test case. This is resulting from an inlining bug, but perhaps it can be converted into .mvir. Btw, if you can identify the problem please let me know.

unknown-invariant-violation-error-bytecode.txt

wrwg commented 1 year ago

Not sure what you guys mean, because this feature — adding a more detailed error message — is already there.

brmataptos commented 1 year ago

So..... this error is very clear and detailed?

task 1 'run'. lines 93-93:
Error: Function execution failed with VMError: {
    major_status: UNKNOWN_INVARIANT_VIOLATION_ERROR,
    sub_status: Some(2),
    location: 0xcafe::vectors,
    indices: [],
    offsets: [(FunctionDefinitionIndex(0), 27)],
}

wrwg commented 1 year ago

This bug is about adding more detailed error messages by exploiting the message field (or similar, I forgot the name), which I ensured is actually supported through the levels of the stack a while ago. What I'm wondering is simply what else you are asking for that would need another bug. In my understanding this is already covered by this bug.

Note that if there is no detail message, there is no printout of it, so perhaps this is confusing here.
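
A rough sketch of that behavior, in Python rather than the actual Rust code: the `VMErrorSketch` type and `render` function are hypothetical and only mirror a few of the fields from the printout above. The point is that the message line appears only when a message was attached.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VMErrorSketch:
    """Hypothetical model of a few VMError fields; not the real type."""
    major_status: str
    sub_status: Optional[int]
    location: str
    message: Optional[str] = None  # the optional detail string


def render(err: VMErrorSketch) -> str:
    sub = f"Some({err.sub_status})" if err.sub_status is not None else "None"
    lines = [
        "VMError: {",
        f"    major_status: {err.major_status},",
        f"    sub_status: {sub},",
        f"    location: {err.location},",
    ]
    # Only print the message when one was attached; otherwise the
    # output looks exactly like the terse printout quoted above.
    if err.message is not None:
        lines.append(f"    message: {err.message},")
    lines.append("}")
    return "\n".join(lines)
```

So an error without a message looks as terse as the one quoted above, even though the field is supported.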

movekevin commented 1 year ago

This error message cannot be too long though and thus likely wouldn’t contain too much information. There can be a need for a lot more details, such as stack traces, that we don't want to generate on validators or even fullnodes.

wrwg commented 1 year ago

Fields for stack trace are already there, but only populated if the code is compiled for testing IIRC.

I would not recommend to add additional fields but instead put info into the string message. It does not scale to add fields for every special case, and every single time is a breaking change. It also is technically not needed because this is pure user info, not to be processed by code.

One can generate a more detailed message when testing is on. There is no issue with having a larger string in the message field.

brmataptos commented 1 year ago

I was confused at first, but I think @wrwg is saying that this bug should cover very precise error messages, possibly including stack traces, as long as it fits into the message field. I agree, but this may be too big a task for a simple bug.

@runtian-zhou, do you have time/health to work on this this week or do you need help?

runtian-zhou commented 1 year ago

I thought the original issue is that error messages get lost while returning the vm errors back to the users from API? That's why I think the reproduce step could be important.

brmataptos commented 1 year ago

@wrwg seems to think the message doesn't get lost. We just need to make sure we include something useful, at least in testing mode. (I'm not sure how that's determined, though. We could reuse an env variable like RUST_BACKTRACE=1 or something for stack traces, but that might not be desirable for those not on our team.)
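
As a sketch of the env-variable idea only (this is not how aptos-core decides it, and `build_message` is a hypothetical helper), the message could stay short by default and include extra detail when the variable is set:

```python
import os
from typing import Mapping, Optional


def build_message(summary: str, details: str,
                  env: Optional[Mapping[str, str]] = None) -> str:
    """Return a short message, or a longer one when RUST_BACKTRACE=1.

    Reusing RUST_BACKTRACE is just the suggestion from this thread;
    a dedicated variable might be a better choice.
    """
    env = os.environ if env is None else env
    if env.get("RUST_BACKTRACE") == "1":
        return f"{summary}\n{details}"
    return summary
```

Passing `env` explicitly keeps the helper easy to exercise in tests without touching the process environment.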

brmataptos commented 1 year ago

Note that the detailed error message is deliberately omitted by our test frameworks in the aptos-core tree.

runtian-zhou commented 1 year ago

Starting to create a list of smaller issues to track this.

runtian-zhou commented 1 year ago

Filed #10039 for further action

vgao1996 commented 1 year ago

Probably need to catch up more on this, but just wanna mention that the error code CONSTRAINT_NOT_SATISFIED has been used wrongly. It was originally meant to represent a very specific bytecode verifier error with generic type parameters, but later it was abused for some other stuff. Ideally we should get this fixed in future versions

humb1t commented 1 year ago

can't wait to see it's live!

juliaaschmidt commented 9 months ago

any news on what the CONSTRAINT_NOT_SATISFIED error means when publishing a package with dependencies to a resource account on a local testnet using the Python SDK?

banool commented 9 months ago

Could you run aptos info and share the output? I'm curious to see what version of the CLI (and therefore local testnet) you're using.

juliaaschmidt commented 9 months ago

There you go:

{
  "Result": {
    "build_branch": "",
    "build_cargo_version": "cargo 1.74.0",
    "build_clean_checkout": "true",
    "build_commit_hash": "",
    "build_is_release_build": "true",
    "build_os": "macos-aarch64",
    "build_pkg_version": "2.3.2",
    "build_profile_name": "cli",
    "build_rust_channel": "",
    "build_rust_version": "rustc 1.74.0 (79e9716c9 2023-11-13) (Homebrew)",
    "build_tag": "",
    "build_time": "2023-11-29 10:20:09 +00:00",
    "build_using_tokio_unstable": "true"
  }
}

I only get it when I call publish_package() on a module with multiple dependent modules. Do you have an example Python SDK deployment script for that scenario? When I publish a module without any dependent modules, it works fine.

banool commented 9 months ago

For starters could you try CLI 2.4.0? It likely won't fix your issue but you might get better error messages.

As for your actual issue, you likely need to publish the dependencies first, and then your top level module that depends on them.

juliaaschmidt commented 9 months ago

As I'm using macOS, the guide says to install aptos-cli via Homebrew, and that's the latest version on there.

By publishing the dependencies first, do you mean aptos move publish to local environment? I did that. I'm just not sure whether I have to publish each package inside the python script again as well, when I only want to interact with the top-most module?

banool commented 9 months ago

The latest version on homebrew is 2.4.0:

$ aptos info | jq -r .Result.build_pkg_version
2.4.0

$ which aptos
/opt/homebrew/bin/aptos

Try upgrading: brew update && brew upgrade aptos.
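
If you'd rather check this from a script than with jq, here is a small sketch. It assumes only the JSON shape of the aptos info output shared earlier in the thread; `cli_version` and `needs_upgrade` are hypothetical helper names, and the 2.4.0 minimum is just the version suggested above.

```python
import json
from typing import Tuple


def cli_version(info_json: str) -> Tuple[int, ...]:
    """Extract build_pkg_version from `aptos info` output as a tuple."""
    version = json.loads(info_json)["Result"]["build_pkg_version"]
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(info_json: str,
                  minimum: Tuple[int, ...] = (2, 4, 0)) -> bool:
    # Tuples compare element by element, so (2, 3, 2) < (2, 4, 0).
    return cli_version(info_json) < minimum
```

Comparing integer tuples avoids the string-comparison trap where "2.10.0" would sort before "2.4.0".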

> By publishing the dependencies first, do you mean aptos move publish to local environment? I did that. I'm just not sure whether I have to publish each package inside the python script again as well, when I only want to interact with the top-most module?

Let's say your module is module A. If your module depends on module B and C (and assuming that these modules are not part of the framework), you have to publish module B and C first before publishing module A. You only have to deploy the modules once. It doesn't matter how you publish, Python or CLI if you want.
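
One way to work out a valid order for larger dependency graphs is a topological sort. The sketch below uses Python's standard-library graphlib and the hypothetical module names A, B and C from the example above; it only computes the order and does not call the SDK.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def publish_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return modules ordered so that dependencies come first.

    `dependencies` maps each module to the set of modules it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Module A depends on B and C; B and C have no non-framework dependencies.
deps = {"A": {"B", "C"}, "B": set(), "C": set()}
order = publish_order(deps)
# B and C come first (in either order), A comes last.
```

You would then publish each module in `order` exactly once, with whichever tool you prefer.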

If you have more issues with this, please reach out on Discord 😄

Oops, closed the issue by accident.

Alivers commented 2 months ago

Do you want to submit transaction for a maximum of 5000000 Octas at a gas unit price of 100 Octas? [yes/no] >
yes
Transaction submitted: https://explorer.aptoslabs.com/txn/b2c5d08f
{
  "Error": "API error: Unknown error Transaction committed on chain, but failed execution: BACKWARD_INCOMPATIBLE_MODULE_UPDATE"
}

Hi, why is there no further message explaining what is incompatible? My command: aptos move publish --profile testnet --included-artifacts none --max-gas 50000

My Aptos CLI version is v4.0.0.
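
For reference, the 5000000 Octas in the prompt above follows from the flags: the maximum fee is the max gas amount times the gas unit price. A quick sketch of the arithmetic (the helper name is hypothetical; 1 APT is 10^8 Octas):

```python
OCTAS_PER_APT = 10**8  # 1 APT = 100,000,000 Octas


def max_fee_octas(max_gas_units: int, gas_unit_price: int) -> int:
    """Upper bound on the transaction fee, in Octas."""
    return max_gas_units * gas_unit_price


fee = max_fee_octas(50_000, 100)  # --max-gas 50000 at 100 Octas per unit
print(fee)                        # 5000000
print(fee / OCTAS_PER_APT)        # 0.05
```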

aptos-labs / aptos-core

[Bug] [Combined] Add clarifying error messages for all VMErrors #9580