New scrypto runtime fuzzer

Hey.

A few weeks ago I had an idea for making fuzzing that involves the WASM VM much faster. I recently found time to build a proof of concept; the implementation is here: https://github.com/hknio/radixdlt-scrypto/pull/1/files

The idea is to replace the WASM VM with a custom fuzz VM that simulates the WASM VM's behavior from the radix-engine's perspective, is about 100x faster, and is much easier to fuzz. It can reuse test scenarios from radix-engine-tests and fuzz them. This is only a proof of concept showing that the approach works; a full implementation with all the features I would like to have would probably take 2-3 months.

Here are all the details:

Radix Runtime Fuzzer

Fuzzing the radix-engine comes with several challenges:

Operations are complex, making it tough for a fuzzer to predict them accurately.
Complex operations often require more than one transaction, such as publishing a package before calling it.
The current setup doesn't support creating a seed corpus from existing tests.
The WASM VM is slow, and large packages make mutation inefficient, as altering a 300 KB WASM package is time-consuming.

Enhancing VM Fuzzing

Starting with the WASM VM, it seems impractical to fuzz the WASM code due to its slow performance and the minimal likelihood of discovering bugs in the WASM VM itself. There are more effective methods for bug detection. The focus should be on fuzzing the native functions invoked by the WASM VM, as defined in scrypto_runtime.rs, like object_call_module, object_new, key_value_entry_set, etc. However, even if we tailor a fuzzer specifically for these functions, predicting the correct input is immensely challenging.

To address this, I developed a fuzz VM capable of simulating every function of the current WASM VM, leveraging data from the existing test suite and transactions. This VM primarily interacts with native functions and executes basic operations, which makes it faster and a better fit for a fuzzer.

The implementation of the fuzz VM involves a new proc macro #[radix_runtime_fuzzer] for impl<'y, Y> WasmRuntime for ScryptoRuntime<'y, Y>. This macro logs every call for later execution (ignoring consume_wasm_execution_units). Executing WASM code generates logs like this (example from Faucet lock_fee):

BufferConsume(0)
ActorOpenField(0, 0, 1)
FieldEntryRead(7)
BufferConsume(1)
ObjectCall([88, 139, 206, 247, 236, 58, 35, 219, 237, 217, 10, 150, 63, 146, 74, 220, 69, 63, 14, 11, 217, 66, 236, 194, 29, 141, 169, 173, 229, 73], [103, 101, 116, 95, 97, 109, 111, 117, 110, 116], [92, 33, 0])
BufferConsume(2)
ObjectCall([88, 139, 206, 247, 236, 58, 35, 219, 237, 217, 10, 150, 63, 146, 74, 220, 69, 63, 14, 11, 217, 66, 236, 194, 29, 141, 169, 173, 229, 73], [108, 111, 99, 107, 95, 102, 101, 101], [92, 33, 2, 160, 0, 0, 32, 89, 221, 100, 240, 12, 15, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0])
BufferConsume(3)
FieldEntryWrite(7, [92, 33, 2, 144, 88, 139, 206, 247, 236, 58, 35, 219, 237, 217, 10, 150, 63, 146, 74, 220, 69, 63, 14, 11, 217, 66, 236, 194, 29, 141, 169, 173, 229, 73, 144, 176, 231, 236, 124, 67, 136, 54, 5, 10, 238, 31, 189, 182, 230, 118, 129, 160, 228, 189, 228, 205, 206, 185, 197, 132, 22, 96, 138, 216, 113])
FieldEntryClose(7)
Return([92, 33, 0])
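The byte arrays in the log are easier to follow once decoded. The second argument of each ObjectCall is the method name as UTF-8 bytes. Here is a minimal Rust sketch that reads them; decode_ident and the constant names are hypothetical helpers for illustration, not part of the PR.

```rust
/// Hypothetical helper (not from the PR): interpret a logged byte array
/// as a UTF-8 identifier, returning None if the bytes are not valid UTF-8.
fn decode_ident(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Method-name bytes copied from the two ObjectCall entries in the log above.
const FIRST_CALL: [u8; 10] = [103, 101, 116, 95, 97, 109, 111, 117, 110, 116];
const SECOND_CALL: [u8; 8] = [108, 111, 99, 107, 95, 102, 101, 101];
```

decode_ident(&FIRST_CALL) gives Some("get_amount") and decode_ident(&SECOND_CALL) gives Some("lock_fee"), so the log reads as: open and read a field, call get_amount, call lock_fee, write the field back, close it, and return.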

This approach allows replacing a single WASM VM execution by calling native functions directly. This method mirrors WASM code execution (excluding transaction fees). Hence, instead of deploying a ~300 KB WASM package and invoking its functions, we can emulate its behavior with just 1 KB of data. This is significantly faster and simplifies the fuzzer's task of predicting correct ScryptoRuntime calls. The generate-runtime-fuzzer-test-cases.py script was developed to facilitate this process automatically. For instance, test_static_package_address involves the following invokes in the last transaction:

INVOKE 0 
-- (Faucet lock_fee, the same as above)
INVOKE 1
-- BufferConsume(0)
-- ActorGetPackageAddress
-- BufferConsume(1)
-- BlueprintCall([13, 207, 168, 202, 9, 60, 141, 181, 165, 247, 113, 135, 9, 142, 188, 146, 119, 41, 242, 253, 134, 39, 193, 175, 248, 186, 79, 11, 240, 160], [83, 97, 109, 112, 108, 101], [99, 97, 108, 108, 101, 101], [92, 33, 0])
-- BufferConsume(2)
-- Return([92, 33, 0])
INVOKE 2
-- BufferConsume(0)
-- Return([92, 33, 0])

In the radix-engine, an experimental FuzzerInstance and FuzzerEngine were created to simulate the behavior of the WASM VM using logged data. Although not a complete VM, it demonstrates the potential of this approach.
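To make the replay idea concrete, here is a rough Rust sketch of how one logged invoke could be replayed against the native runtime. It is not the FuzzerInstance/FuzzerEngine code from the PR: LoggedCall, NativeRuntime, replay and RecordingRuntime are names invented for this illustration, and the argument types are assumptions based on the log format shown earlier.

```rust
/// Hypothetical subset of the logged calls. Variant names follow the log
/// format in this post; the argument types are assumptions.
#[derive(Debug, Clone, PartialEq)]
enum LoggedCall {
    BufferConsume(u32),
    ActorOpenField(u32, u32, u32),
    FieldEntryRead(u32),
    FieldEntryClose(u32),
    ObjectCall { receiver: Vec<u8>, ident: Vec<u8>, args: Vec<u8> },
    Return(Vec<u8>),
}

/// Simplified stand-in for the native runtime. The real trait is
/// WasmRuntime; this one-method interface is an assumption for the sketch.
trait NativeRuntime {
    fn dispatch(&mut self, call: &LoggedCall) -> Result<(), String>;
}

/// Replays one invoke: forwards each logged call to the runtime and stops
/// at the first Return, yielding its payload. A runtime error ends the
/// replay early.
fn replay<R: NativeRuntime>(rt: &mut R, calls: &[LoggedCall]) -> Result<Vec<u8>, String> {
    for call in calls {
        match call {
            LoggedCall::Return(data) => return Ok(data.clone()),
            other => rt.dispatch(other)?,
        }
    }
    Err("invoke ended without Return".to_string())
}

/// Test double that only records what it was asked to run.
struct RecordingRuntime {
    seen: Vec<LoggedCall>,
}

impl NativeRuntime for RecordingRuntime {
    fn dispatch(&mut self, call: &LoggedCall) -> Result<(), String> {
        self.seen.push(call.clone());
        Ok(())
    }
}
```

Replaying INVOKE 2 from the example above (BufferConsume(0) followed by Return([92, 33, 0])) with the recording runtime forwards one call and returns the three payload bytes. A fuzzer would mutate the list of LoggedCall values rather than a WASM binary.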

A fully functional fuzz VM would be more sophisticated, handling basic stack operations and implementing the slicing operator to reuse data from function calls in subsequent operations. Based on calls and returns of ScryptoRuntime functions, this could support more complex behaviors while remaining simple enough for a fuzzer.
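As a rough illustration of what "basic stack operations plus slicing" could look like, here is a small Rust sketch. The instruction set (Op) and the run function are hypothetical and do not exist in the PR; the native closure stands in for a ScryptoRuntime call.

```rust
/// Hypothetical instruction set for a stack-based fuzz VM. None of these
/// names come from the PR; they only illustrate the idea.
enum Op {
    /// Push literal bytes onto the stack.
    Push(Vec<u8>),
    /// Pop one argument, call the native function, push its return value.
    Call,
    /// Pop a value and push the sub-range [start, start + len) of it,
    /// clamped to the value's bounds so any fuzzer input stays valid.
    Slice { start: usize, len: usize },
}

/// Runs `ops`, using `native` as a stand-in for a ScryptoRuntime call,
/// and returns the final stack. Popping an empty stack yields an empty
/// value instead of an error, so mutated programs always execute.
fn run(ops: &[Op], native: impl Fn(&[u8]) -> Vec<u8>) -> Vec<Vec<u8>> {
    let mut stack: Vec<Vec<u8>> = Vec::new();
    for op in ops {
        match op {
            Op::Push(bytes) => stack.push(bytes.clone()),
            Op::Call => {
                let arg = stack.pop().unwrap_or_default();
                stack.push(native(&arg));
            }
            Op::Slice { start, len } => {
                let value = stack.pop().unwrap_or_default();
                let from = (*start).min(value.len());
                let to = from.saturating_add(*len).min(value.len());
                stack.push(value[from..to].to_vec());
            }
        }
    }
    stack
}
```

The point of Slice is that a value returned by one call (for example a handle or an address embedded in the return payload) can be cut out and fed into the next call without the fuzzer having to guess those bytes.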

The radix-runtime-fuzzer includes an example fuzzer using the fuzz VM and data from radix-engine-tests. It processes test scenarios generated by generate-runtime-fuzzer-test-cases.py and fuzzes their invokes.

Additional Enhancements and Ideas

Several improvements can further enhance fuzzing:

SBOR is a poor fit for working with a seed corpus, and Arbitrary is also not optimal. An ideal fuzzing serialization library should work both ways (serialize and deserialize) so that it can generate a seed corpus, and it should mark the end of data structures with a terminator instead of a size prefix.
ScryptoRuntime should take specific types for arguments (e.g. NodeId) instead of Vec<u8> that is decoded later.
Predicting Own/NodeId is challenging. For fuzzing, it should always be generated incrementally (1, 2, 3, ...), and the size of NodeId data should be limited to prefix + 2 bytes (at least from the serializer's perspective).
Transaction NodeId references are a problem. An alternative could be to ignore them or include everything in references.
Fuzzing single transactions is inadequate. The fuzzer in my approach handles multiple transactions but uses a fixed transaction hash for predictability.
CPU usage for fuzzing instrumentation is high. It should be removed from places and dependencies where it is not useful, but that is not simple: it requires careful manual compilation and linking with rustc.
Data from testnet/mainnet could be used to generate fuzz cases, but it is not easy to isolate a few transactions and then execute them correctly.
The current fuzzer can't find logic errors or some math errors, only panics and timeouts. A more complex solution is needed to find other errors.
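To show why a terminator works better than a size prefix for mutation, here is a minimal Rust sketch of a terminator-based byte encoding. It is only an illustration of the serializer point in the list above: the TERM/ESC values and the function names are my own assumptions, not an existing library.

```rust
const TERM: u8 = 0x00; // ends a field
const ESC: u8 = 0x01; // the next byte is taken literally

/// Encodes one field: escapes TERM/ESC bytes inside the data, then
/// appends TERM. No length is written anywhere.
fn encode_field(data: &[u8], out: &mut Vec<u8>) {
    for &b in data {
        if b == TERM || b == ESC {
            out.push(ESC);
        }
        out.push(b);
    }
    out.push(TERM);
}

/// Decodes one field and returns it together with the remaining input.
/// Total by design: running out of input simply ends the field, so every
/// byte string a fuzzer produces decodes to something.
fn decode_field(input: &[u8]) -> (Vec<u8>, &[u8]) {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        match input[i] {
            TERM => return (out, &input[i + 1..]),
            ESC if i + 1 < input.len() => {
                out.push(input[i + 1]);
                i += 2;
            }
            ESC => i += 1, // dangling escape at end of input: ignore it
            b => {
                out.push(b);
                i += 1;
            }
        }
    }
    (out, &input[input.len()..])
}
```

With a size prefix, inserting one byte into a field shifts everything after it and usually makes the rest of the input undecodable. Here, inserting a byte into the first field only changes that field; the following fields still decode to the same values, which keeps more mutated inputs meaningful.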

Next steps

I would gladly take on some of these problems and develop new open source tools. I am especially interested in creating a fully functional fuzz VM and a better serialization library for it, but it's a long-term project. If you are interested, we can discuss the details privately.
