FuelLabs / fuel-indexer

🗃 The Fuel indexer is a standalone service that can be used to index various components of the Fuel blockchain.
https://docs.fuel.network/docs/indexer/
140 stars 66 forks source link

enhancement: WASM error codes, early exit on error or kill switch #1337

Closed lostman closed 1 year ago

lostman commented 1 year ago

Description

UPDATE: in addition to adding error codes to host functions as described below, this PR adds another host function:

fn early_exit(err_code: WasmIndexerError) -> !

Using this function we can remove more unwrap or expect calls.

Building on https://github.com/FuelLabs/fuel-indexer/pull/1293.

This PR follows the example in wasmer repository:

https://github.com/wasmerio/wasmer/blob/master/examples/early_exit.rs#L60

    fn early_exit() -> Result<(), ExitCode> {
        // This is where it happens.
        Err(ExitCode(1))
    }

Note that the return value is Result<(), ExitCode> while the function is imported in WASM as

https://github.com/wasmerio/wasmer/blob/master/examples/early_exit.rs#L41

  (type $early_exit_t (func (param) (result)))

That is, as if it were fn early_exit();.

This PR does the same for our FFI functions. For instance:

fn put_object(
    mut env: FunctionEnvMut<IndexEnv>,
    type_id: i64,
    ptr: u32,
    len: u32,
) -> Result<(), WasmIndexerError> { }

While their types—as declared to the WASM module which will use them—doesn't mention the Result value:

https://github.com/FuelLabs/fuel-indexer/pull/1337/files#diff-447e55cb0fae7d02ac8fd8f7de5a521cbf60d3f2422b5166c39faeb9f995bea6R21-R29

The one difference between how things are handled in early_exit example versus this PR is getting the error code out:

https://github.com/wasmerio/wasmer/blob/master/examples/early_exit.rs#L93

When I tried the same method:

Err(e) => match e.downcast::<ExitCode>()

I got no error out of it. However, the error was printed as RuntimeError(User(EarlyExit)), so the error was handled correctly. (User indicates that a user-defined error triggered a WASM trap).

So, I ended up handling the error this way:

https://github.com/FuelLabs/fuel-indexer/pull/1337/files#diff-fa722c66d2896c684e5a26aa2c56b7ec1734888609fbb64ecbc7dfd6a420c4ffR933

            } else {
                if let Some(e) = e
                    .source()
                    .and_then(|e| e.downcast_ref::<WasmIndexerError>())
                {
                    error!("Indexer({uid}) WASM execution failed: {e}.");
                } else {
                    error!("Indexer({uid}) WASM execution failed: {e:?}.");
                };
                self.db.lock().await.revert_transaction().await?;
                return Err(IndexerError::from(e));
            }

I am unsure why I couldn't downcast the error the same way as was shown in the example.

Testing steps

Manual testing:

  1. Start an indexer.
cargo run -p fuel-indexer -- run --fuel-node-host beta-4.fuel.network --fuel-node-port 80 --replace-indexer --manifest examples/fuel-explorer/fuel-explorer/fuel_explorer.manifest.yaml
  1. Redeploy:
cargo run -p forc-index -- deploy --path examples/fuel-explorer/fuel-explorer --replace-indexer

Expected output:

2023-09-07T09:12:49.663539Z  INFO fuel_indexer::service: 399: Resuming Indexer(fuellabs.explorer) from block 1240
2023-09-07T09:12:54.321165Z  INFO fuel_indexer::database: 206: Database loading schema for Indexer(fuellabs.explorer) with Version(143093c6bbe3a8937f4fb71514cbe6799266c44398866dbbd23e12b977bc1641).
2023-09-07T09:12:54.346555Z  INFO fuel_indexer::executor: 110: Indexer(fuellabs.explorer) subscribing to Fuel node at beta-4.fuel.network:80
2023-09-07T09:12:54.346681Z  WARN fuel_indexer::executor: 117: No end_block specified in manifest. Indexer will run forever.
2023-09-07T09:12:54.346717Z  INFO fuel_indexer::service: 335: Indexer(fuellabs.explorer) was replaced. Stopping previous version of Indexer(fuellabs.explorer).
2023-09-07T09:12:54.347376Z ERROR fuel_indexer::executor: 941: Indexer(fuellabs.explorer) WASM execution failed: Kill switch has been triggered.
2023-09-07T09:12:54.347819Z  INFO fuel_indexer::executor: 199: Kill switch flipped, stopping Indexer(fuellabs.explorer). <('.')>

In particular, this line indicates that an early termination happened due to the kill switch:

2023-09-07T09:12:54.347376Z ERROR fuel_indexer::executor: 941: Indexer(fuellabs.explorer) WASM execution failed: Kill switch has been triggered.

Changelog

ra0x3 commented 1 year ago

@lostman

lostman commented 1 year ago

@lostman What's the difference between this and #1293 ?

@deekerno's PR was a starting point. This PR adds a few things. From changelog:

FFI functions can access the kill switch indicator and exit early when the kill switch has been triggered. (added in this PR) FFI functions can exit early on error, including sqlx database operation errors. (added in this PR) Add WASM error codes (from @deekerno's PR) Get an error code from the call to the indexer's WASM module. (added in this PR)

The most significant difference is how the error codes are returned:

https://github.com/FuelLabs/fuel-indexer/pull/1337/commits/e74645e0d418db7834e8a271ad50edcaaf055624#diff-447e55cb0fae7d02ac8fd8f7de5a521cbf60d3f2422b5166c39faeb9f995bea6L27-L28

In @deekerno's PR, the result is u32 error code. As I explain in the description, in this PR, I follow the early_exit example—the function signature, as declared in the FFI interface mentions no Result type. wasmer terminates the execution immediately if the host function returns an error this way (which is exactly what we want here).

This way, even get_object, which returns u32, can be terminated early:

https://github.com/FuelLabs/fuel-indexer/pull/1337/files#diff-25a226eeb0da9150a8af6c4267e5745445aab8a84cbe942fa13d8a7756246c11R119

It now returns Result<u32, WasmIndexerError>. However, in the FFI interface, it is still -> *mut u8.

ra0x3 commented 1 year ago

@lostman

lostman commented 1 year ago

@ra0x3, agreed. I will add a special-case for the kill switch. That is, after all, the expected behavior.

lostman commented 1 year ago

I added a WasmIndexExecutor test for exit codes. It is a bit crude but covers the basics.

lostman commented 1 year ago

Left some more feedback

  • @lostman I also wouldn't rush this PR
  • We do have a deadline, but this work should be able to make it in comfortably before that date, with plenty of time to spare
  • This is an extremely sensitive PR, so let's not rush it

I'd rather merge this now (very soon) and have it battle-tested for a couple of weeks than merge it just before the deadline and have no time to resolve any issues that may arise.

I need this to finish #1150 and again, I'd rather test both for a couple of weeks before the deadline is upon us.

lostman commented 1 year ago

@ra0x3, @deekerno,

I added an early_exit function and tried using it in handler_block_wasm:

https://github.com/FuelLabs/fuel-indexer/commit/adee0f8e08e2a404dbb067173cfc25f1cc0cda8d#diff-7121bb4841992a7a257dbb97c0e60f371c59f2845b4413d5413b7adf779feb5aR22

and some trybuild tests failed to link, complaining about missing ff_early_exit at the linking stage:

https://github.com/FuelLabs/fuel-indexer/actions/runs/6171631718/job/16750296869#step:13:106

All integration tests succeeded, and I could compile and deploy indexers.

Any idea what could be going on?

It would've been a nice use-case for this function.

ra0x3 commented 1 year ago

Any idea what could be going on?

@lostman

lostman commented 1 year ago

@ra0x3, yes, it helps. Thanks!

Just pushed this: https://github.com/FuelLabs/fuel-indexer/pull/1337/commits/46507e8ef0082f6d0becdff4a51bbea7cfb9a2cf

And trybuild tests pass 😄 (locally; waiting on CI)

ra0x3 commented 1 year ago

@lostman Is this mean to be re-reviewed? I do remember leaving a previous review, but I see I'm ping'd for review again. But I also see a lot of the feedback was not implemented (and as well no reason was provided as to why).

lostman commented 1 year ago

@ra0x3, yes, it is meant to be re-reviewed. I can't re-trigger the review request since it is already re-requested.

I implemented your feedback. What is missing?

ra0x3 commented 1 year ago

@lostman The review comments are unresolved, and they aren't labeled as "outdated". For any review comments where you implement the feedback, be sure to resolve the conversation (so it doesn't clutter the review, and look as if it's unresolved). And for anything you didn't implement, just leave a responding comment as to why you chose to to implement the feedback.

lostman commented 1 year ago

Apologies. GitHub UI tricked me a little:

Screenshot 2023-09-15 at 4 31 14 PM

Completely missed these comments. 😂

ra0x3 commented 1 year ago

@lostman CI failing