radixdlt / radixdlt-scrypto

Scrypto is the asset-oriented smart contract programming language of the Radix network. It allows you to quickly build secure and composable dApps.
https://developers.radixdlt.com/

New scrypto runtime fuzzer #1710

Open bbarwik opened 4 months ago

bbarwik commented 4 months ago

Hey.

A few weeks ago I had an idea for making fuzzing that involves the WASM VM much faster. I recently found time to build a proof of concept, and the implementation is here: https://github.com/hknio/radixdlt-scrypto/pull/1/files

The idea is to replace the WASM VM with a custom fuzz VM that simulates the WASM VM's behavior from the radix-engine's perspective, is 100x faster, and is much easier to fuzz. It can reuse test scenarios from radix-engine-tests and fuzz them. This is only a proof of concept showing that the approach works; a full implementation with all the features I would like to have would probably take 2-3 months.

Here are all the details:

Radix Runtime Fuzzer

Fuzzing the radix-engine comes with several challenges:

Enhancing VM Fuzzing

Starting with the WASM VM, it seems impractical to fuzz the WASM code due to its slow performance and the minimal likelihood of discovering bugs in the WASM VM itself. There are more effective methods for bug detection. The focus should be on fuzzing the native functions invoked by the WASM VM, as defined in scrypto_runtime.rs, like object_call_module, object_new, key_value_entry_set, etc. However, even if we tailor a fuzzer specifically for these functions, predicting the correct input is immensely challenging.

To address this, I developed a fuzz VM capable of simulating every function of the current WASM VM, leveraging data from the existing test suite and transactions. This VM primarily interacts with native functions and executes basic operations, which makes it faster and a better fit for a fuzzer.

The implementation of the fuzz VM involves a new proc macro #[radix_runtime_fuzzer] for impl<'y, Y> WasmRuntime for ScryptoRuntime<'y, Y>. This macro logs every call for later execution (ignoring consume_wasm_execution_units). Executing WASM code generates logs like this (example from Faucet lock_fee):

BufferConsume(0)
ActorOpenField(0, 0, 1)
FieldEntryRead(7)
BufferConsume(1)
ObjectCall([88, 139, 206, 247, 236, 58, 35, 219, 237, 217, 10, 150, 63, 146, 74, 220, 69, 63, 14, 11, 217, 66, 236, 194, 29, 141, 169, 173, 229, 73], [103, 101, 116, 95, 97, 109, 111, 117, 110, 116], [92, 33, 0])
BufferConsume(2)
ObjectCall([88, 139, 206, 247, 236, 58, 35, 219, 237, 217, 10, 150, 63, 146, 74, 220, 69, 63, 14, 11, 217, 66, 236, 194, 29, 141, 169, 173, 229, 73], [108, 111, 99, 107, 95, 102, 101, 101], [92, 33, 2, 160, 0, 0, 32, 89, 221, 100, 240, 12, 15, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0])
BufferConsume(3)
FieldEntryWrite(7, [92, 33, 2, 144, 88, 139, 206, 247, 236, 58, 35, 219, 237, 217, 10, 150, 63, 146, 74, 220, 69, 63, 14, 11, 217, 66, 236, 194, 29, 141, 169, 173, 229, 73, 144, 176, 231, 236, 124, 67, 136, 54, 5, 10, 238, 31, 189, 182, 230, 118, 129, 160, 228, 189, 228, 205, 206, 185, 197, 132, 22, 96, 138, 216, 113])
FieldEntryClose(7)
Return([92, 33, 0])
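As a reading aid for logs like the one above: the second argument of each ObjectCall is the method name as raw bytes, and it decodes as ASCII ("get_amount" and "lock_fee" here). The following is a minimal sketch of a helper that makes such arguments readable. The name decode_name is hypothetical and is not part of the linked pull request.

```rust
/// Hypothetical helper (not part of the PR): interpret a logged byte
/// array as UTF-8 text. Returns None for binary data such as addresses.
fn decode_name(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Method-name arguments copied from the two ObjectCall entries above.
const FIRST_CALL_IDENT: [u8; 10] = [103, 101, 116, 95, 97, 109, 111, 117, 110, 116];
const SECOND_CALL_IDENT: [u8; 8] = [108, 111, 99, 107, 95, 102, 101, 101];

/// First bytes of the receiver address from the same entries.
/// This is binary data, not text, so decoding is expected to fail.
const RECEIVER_PREFIX: [u8; 3] = [88, 139, 206];
```

Calling decode_name on the two identifier arrays yields the method names, while the receiver address is not valid UTF-8 and yields None, so a log pretty-printer could fall back to the raw byte list in that case.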

Replaying such a log replaces a single WASM VM execution with direct calls to the native functions, mirroring what the WASM code did (excluding transaction fees). Hence, instead of deploying a ~300 KB WASM package and invoking its functions, we can emulate its behavior with just 1 KB of data. This is significantly faster and simplifies the fuzzer's task of predicting correct ScryptoRuntime calls. The generate-runtime-fuzzer-test-cases.py script was developed to extract these logs automatically. For instance, test_static_package_address involves the following invokes in the last transaction:

INVOKE 0
-- (Faucet lock_fee, the same as above)
INVOKE 1
-- BufferConsume(0)
-- ActorGetPackageAddress
-- BufferConsume(1)
-- BlueprintCall([13, 207, 168, 202, 9, 60, 141, 181, 165, 247, 113, 135, 9, 142, 188, 146, 119, 41, 242, 253, 134, 39, 193, 175, 248, 186, 79, 11, 240, 160], [83, 97, 109, 112, 108, 101], [99, 97, 108, 108, 101, 101], [92, 33, 0])
-- BufferConsume(2)
-- Return([92, 33, 0])
INVOKE 2
-- BufferConsume(0)
-- Return([92, 33, 0])

In the radix-engine, an experimental FuzzerInstance and FuzzerEngine were created to simulate the behavior of the WASM VM using logged data. Although not a complete VM, it demonstrates the potential of this approach.
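To illustrate the replay idea, here is a minimal sketch under stated assumptions. The LoggedCall enum mirrors a subset of the log entries shown earlier, but the NativeApi trait, its method signatures, and the replay function are hypothetical stand-ins. They are not the real WasmRuntime trait, nor the FuzzerInstance or FuzzerEngine from the pull request.

```rust
/// A subset of the logged calls, named after the entries in the log.
#[derive(Debug, Clone, PartialEq)]
enum LoggedCall {
    BufferConsume(u32),
    FieldEntryRead(u32),
    FieldEntryWrite(u32, Vec<u8>),
    FieldEntryClose(u32),
    ObjectCall(Vec<u8>, Vec<u8>, Vec<u8>),
    Return(Vec<u8>),
}

/// Hypothetical stand-in for the native functions (assumed signatures).
trait NativeApi {
    fn buffer_consume(&mut self, buffer_id: u32) -> Result<(), String>;
    fn field_entry_read(&mut self, handle: u32) -> Result<(), String>;
    fn field_entry_write(&mut self, handle: u32, data: &[u8]) -> Result<(), String>;
    fn field_entry_close(&mut self, handle: u32) -> Result<(), String>;
    fn object_call(&mut self, receiver: &[u8], ident: &[u8], args: &[u8]) -> Result<(), String>;
}

/// Replay logged calls in order. Stops at the first Return and yields its
/// payload; yields None if the log ends without a Return.
fn replay<A: NativeApi>(api: &mut A, calls: &[LoggedCall]) -> Result<Option<Vec<u8>>, String> {
    for call in calls {
        match call {
            LoggedCall::BufferConsume(id) => api.buffer_consume(*id)?,
            LoggedCall::FieldEntryRead(h) => api.field_entry_read(*h)?,
            LoggedCall::FieldEntryWrite(h, data) => api.field_entry_write(*h, data)?,
            LoggedCall::FieldEntryClose(h) => api.field_entry_close(*h)?,
            LoggedCall::ObjectCall(receiver, ident, args) => {
                api.object_call(receiver, ident, args)?
            }
            LoggedCall::Return(payload) => return Ok(Some(payload.clone())),
        }
    }
    Ok(None)
}

/// Test double that only records which native functions were invoked.
#[derive(Default)]
struct TracingApi {
    trace: Vec<&'static str>,
}

impl NativeApi for TracingApi {
    fn buffer_consume(&mut self, _buffer_id: u32) -> Result<(), String> {
        self.trace.push("buffer_consume");
        Ok(())
    }
    fn field_entry_read(&mut self, _handle: u32) -> Result<(), String> {
        self.trace.push("field_entry_read");
        Ok(())
    }
    fn field_entry_write(&mut self, _handle: u32, _data: &[u8]) -> Result<(), String> {
        self.trace.push("field_entry_write");
        Ok(())
    }
    fn field_entry_close(&mut self, _handle: u32) -> Result<(), String> {
        self.trace.push("field_entry_close");
        Ok(())
    }
    fn object_call(&mut self, _r: &[u8], _i: &[u8], _a: &[u8]) -> Result<(), String> {
        self.trace.push("object_call");
        Ok(())
    }
}
```

Because the log is plain data, a fuzzer can mutate the list of LoggedCall values (reordering entries, changing handles, corrupting payloads) and feed each mutation to replay without compiling or loading any WASM.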

A fully functional fuzz VM would be more sophisticated, handling basic stack operations and implementing the slicing operator to reuse data from function calls in subsequent operations. Based on calls and returns of ScryptoRuntime functions, this could support more complex behaviors while remaining simple enough for a fuzzer.
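One way to picture the stack and slicing operations is the sketch below. It is purely illustrative: the Op enum, its variants, and the run function are assumptions made for this example and do not describe an existing design. The motivation is visible in the Faucet log, where the FieldEntryWrite payload contains, starting at offset 4, the same 30 bytes used as the ObjectCall receiver; a slice operation lets the VM reuse such bytes instead of forcing the fuzzer to guess them.

```rust
/// Hypothetical instruction set for a tiny byte-string stack machine.
#[derive(Debug, Clone)]
enum Op {
    /// Push a literal byte string.
    Push(Vec<u8>),
    /// Duplicate the top of the stack.
    Dup,
    /// Replace the top of the stack with its sub-range start..end.
    /// Bounds are clamped so that fuzzer-generated values never panic.
    Slice { start: usize, end: usize },
    /// Pop two values and push their concatenation (lower value first).
    Concat,
}

/// Execute the ops and return the final stack, or an error on underflow.
fn run(ops: &[Op]) -> Result<Vec<Vec<u8>>, String> {
    let mut stack: Vec<Vec<u8>> = Vec::new();
    for op in ops {
        match op {
            Op::Push(bytes) => stack.push(bytes.clone()),
            Op::Dup => {
                let top = stack.last().cloned().ok_or("stack underflow on Dup")?;
                stack.push(top);
            }
            Op::Slice { start, end } => {
                let top = stack.pop().ok_or("stack underflow on Slice")?;
                let s = (*start).min(top.len());
                let e = (*end).clamp(s, top.len());
                stack.push(top[s..e].to_vec());
            }
            Op::Concat => {
                let second = stack.pop().ok_or("stack underflow on Concat")?;
                let mut first = stack.pop().ok_or("stack underflow on Concat")?;
                first.extend_from_slice(&second);
                stack.push(first);
            }
        }
    }
    Ok(stack)
}
```

Clamping the slice bounds rather than rejecting them is a deliberate choice for this sketch: it keeps every mutated program executable, so fuzzing time is spent in the native functions rather than on VM-level errors.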

The radix-runtime-fuzzer includes an example fuzzer using the fuzz VM and data from radix-engine-tests. It processes test scenarios generated by generate-runtime-fuzzer-test-cases.py and fuzzes their invokes.

Additional Enhancements and Ideas

Several improvements can further enhance fuzzing:

Next steps

I would gladly take on some of these problems and develop new open source tools. I am especially interested in creating a fully functional fuzz VM and a better serialization library for it, but that is a long-term project. If you are interested, we can discuss the details privately.

bbarwik commented 4 months ago

I also experimented with two more things:

The code is rough, but it clearly shows the ideas I used to make it work, and they could be implemented in a better way. I uploaded my experiments here: https://github.com/hknio/radixdlt-scrypto/pull/2/files