stellar / rs-soroban-env

Rust environment for Soroban contracts.
Apache License 2.0
59 stars 40 forks source link

Preserve backtraces across secondary error events #1349

Open brson opened 5 months ago

brson commented 5 months ago

Closes #1266

What

This attempts to make the backtraces printed by the SDK while debugging unit tests more useful by presenting the backtrace from the first error raised during a contract call instead of a backtrace from a re-raised error.

The main change here is to add a Host::secondary_error function, and in several places that previously called Host::error, call Host::secondary_error instead. Where Host::error receives a Error, collects diagnostics, and returns a HostError, Host::secondary_error receives a HostError and returns a HostError, collecting new diagnostics, but propagating the backtrace from the input error to the output error.

To make this useful, to handle errors in test contracts TestContractFrame stores an entire HostError, with backtrace, instead of a Error; call_n_internal then reuses that error to call secondary_error and re-raise the original error.

The VM dispatch macros also use secondary_error to preserve the original backtrace while also generating new diagnostic events.

In addition to the unit tests I have manually tested this with the SDK.

Why

Background in https://github.com/stellar/rs-soroban-env/issues/1266

In most of my experiences debugging contract unit tests, fuzz tests in particular, I have found the backtraces produced to not be useful because they are produced too late, having previously thrown away the backtrace from the originating error. As a result I have tended to make ad-hoc customizations to the soroban-env-host crate, like printlns in Host::error to figure out where the original error happened.

Known limitations

Note the ugly impl of Hash for TestContractFrame. This type now contains a Backtrace (and diagnostic events) in the testutils config. This could be hashed, but it would be expensive and system-dependent. From looking at how this hash is used in the trace module, I am suspecting that hashing this backtrace isn't desirable nor needed for correctness.

This has two test cases: one for wasm contracts and one for test contracts. These are brittle for two reasons:

This code introduces several expensive clones, but only in testutils configuration with diagnostics enabled.

It would be possible to preserve more backtraces if HostError was modified to contain a stack of them, but I don't think that would be useful enough to make it worthwile.