gear-tech / gear

Web3 Ultimate Execution Engine
https://gear-tech.io
GNU General Public License v3.0
231 stars, 102 forks

Randomization of fuzzer configuration through `arbitrary` crate #3988

Open playX18 opened 1 month ago

playX18 commented 1 month ago

Problem to Solve

At the moment, the fuzzer config and what we pass to wasm-smith for WASM generation are mostly static. This prevents us from getting fully randomized programs that cover a wide range of possible usages of the Gear API, and it potentially lowers the coverage percentage we could actually get.

Possible Solution

Notes

What not to randomize

What's worth randomizing

What to increase

I think it's also worth calculating the allowed gas for each program we generate, based on an estimate derived from the config and the total number of instructions in the program, so that generated programs run to completion more often instead of failing because they exceeded the gas limit.
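As a rough illustration of the per-program gas estimate suggested above, here is a minimal sketch. Every name in it (`ProgramShape`, `GasModel`, `estimate_gas_limit`) and every cost figure is hypothetical and not taken from the Gear codebase. The model is also deliberately naive: it is linear in the instruction count and ignores loops, which can make a program execute far more instructions than it contains.

```rust
/// Hypothetical summary of a generated program (not a Gear type).
#[derive(Debug, Clone, Copy)]
pub struct ProgramShape {
    /// Total number of instructions in the generated module.
    pub instructions: u64,
    /// Number of syscall invocations injected into the module.
    pub injected_syscalls: u64,
}

/// Hypothetical cost model; all values are assumptions chosen by the caller.
#[derive(Debug, Clone, Copy)]
pub struct GasModel {
    /// Fixed overhead charged for any execution.
    pub base: u64,
    /// Assumed average cost of one plain instruction.
    pub per_instruction: u64,
    /// Assumed average cost of one injected syscall.
    pub per_syscall: u64,
    /// Upper bound, so the estimate never exceeds what the fuzzer may spend.
    pub cap: u64,
}

/// Linear estimate: base + instructions * cost + syscalls * cost, clamped to `cap`.
/// Saturating arithmetic keeps the function total for extreme inputs.
pub fn estimate_gas_limit(shape: ProgramShape, model: GasModel) -> u64 {
    let instruction_cost = shape.instructions.saturating_mul(model.per_instruction);
    let syscall_cost = shape.injected_syscalls.saturating_mul(model.per_syscall);
    model
        .base
        .saturating_add(instruction_cost)
        .saturating_add(syscall_cost)
        .min(model.cap)
}
```

With a model of `base = 1_000`, `per_instruction = 10`, `per_syscall = 500`, a program with 500 instructions and 12 injected syscalls would get a limit of 12,000, provided the cap is higher than that.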

techraed commented 1 month ago

Thank you, @playX18! Great research! :+1:

Some thoughts to share.

> Increase the amount of gas by calculating the theoretically needed amount for each program we generate instead of using constant gas, which should also help us run more programs without failure after a short period of time.
>
> I think it's worth to also calculate allowed gas for each program that we generate based on estimation through config and total instructions in the program so we can run the generated programs until completion more often rather than failing because we exceeded our gas limit.

I guess we are absolutely OK with the amount of gas we send right now. Getting "gas counter exceeded" errors is fine, especially when there aren't that many of them, because the aim is not to have absolutely infallible executions.

> max_funcs: almost never results in increasing coverage
>
> min_funcs: very similar to max_funcs in terms of what happens to coverage and program amounts.

Absolutely agree, no need to touch that. Maybe lower it (max/min: 1) in some situations, so that only syscalls are executed. But that requires narrowing the range of syscalls inserted into the program.

> Control instructions do have a high impact on the runtime of a program and increase coverage significantly

Btw, in theory this looks wrong. You see, without Control instructions the program ends pretty fast, as there are no loops or branches, just N plain instructions with syscalls. Actually, a variant with 1) no Control instructions, 2) fewer than 100 instructions (even fewer would do, but you still need some of them so that memory-related instructions are generated and program memory is updated at runtime), 3) 1-2 functions at most, and 4) proper control over the number of syscalls injected could be the most promising for coverage. Can you check that, please?
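The four-point variant above could be written down as a config preset roughly like this. This is only a sketch: `GeneratorShape` and its fields are hypothetical stand-ins, not the actual fuzzer or wasm-smith config types, and the concrete numbers (80 instructions, 8 syscalls) are assumptions that merely satisfy the stated bounds.

```rust
/// Hypothetical, simplified view of the generator settings discussed in this thread.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct GeneratorShape {
    /// Whether Control instructions (loops, branches) may be generated.
    pub allow_control_instructions: bool,
    /// Upper bound on generated instructions.
    pub max_instructions: usize,
    /// Lower bound on generated functions.
    pub min_funcs: usize,
    /// Upper bound on generated functions.
    pub max_funcs: usize,
    /// Upper bound on the total number of injected syscalls (assumed knob).
    pub max_syscalls_total: usize,
}

impl GeneratorShape {
    /// Preset for the "straight-line" variant: no Control instructions,
    /// fewer than 100 instructions, 1-2 functions, a small syscall budget.
    pub fn straight_line() -> Self {
        Self {
            allow_control_instructions: false,
            max_instructions: 80,  // assumed value, anything below 100 qualifies
            min_funcs: 1,
            max_funcs: 2,
            max_syscalls_total: 8, // assumed value, to be tuned against gas usage
        }
    }

    /// Checks the four conditions listed in the comment above.
    pub fn is_straight_line_variant(&self) -> bool {
        !self.allow_control_instructions
            && self.max_instructions < 100
            && self.min_funcs >= 1
            && self.min_funcs <= self.max_funcs
            && self.max_funcs <= 2
            && self.max_syscalls_total > 0
    }
}
```

A shape resembling the current setup mentioned later in the thread (500 instructions, Control instructions enabled) would fail `is_straight_line_variant`, which makes it easy to report coverage separately for the two kinds of runs.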

> Random range of SyscallInjectionTypes in StandardGearWasmConfigsBundle

Here are two things I'd point out.

First, I agree with the idea of changing the ranges for syscalls. That's great, but there can't be one general approach to creating semi-random syscall injection ranges, because some syscalls can be injected more often than others. You should pay attention to the possible gas consumption here. See, in the current config we insert 10-15 gr_send syscalls, which is absolutely okay with 500 instructions and Control instructions enabled, as you have a really low chance of executing even 5 of them (I can't remember being so lucky; maybe with loops, but that's a different story). If you lower the number of instructions and functions and keep 10-15 injections of gr_send, or even remove Control instructions, that can cause lots of problems with gas, as you won't have enough gas to invoke all 10-15 injected gr_sends. My point is that the injection range for each syscall (or group of syscalls) must be chosen in the context of max_instructions, Control instructions and min/max_functions.

Second, I think it would be nice to randomly change not only the injection rates, but also the set of syscalls to be injected. I mean, sometimes some syscalls should be injected 0 times (i.e., not injected at all), and some should be left out more often than others. Say, you need gr_send in 90% of executions.
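Both points can be combined in one small sketch: decide per syscall whether it is injected at all, then pick the count from a range that depends on the program shape. Everything here is hypothetical. `ByteSource` is a hand-rolled stand-in for a fuzzer-driven byte reader (in the real fuzzer this role would presumably be played by the `arbitrary` crate), `Context` and `pick_injection_count` are invented names, and the reduced `1..=3` range is an assumed value. Only the 10-15 range and the 90% figure for gr_send come from the comment above.

```rust
use std::ops::RangeInclusive;

/// Minimal stand-in for a fuzzer-driven source of randomness.
/// Yields 0 once the input is exhausted, so it never fails.
pub struct ByteSource<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> ByteSource<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, pos: 0 }
    }

    fn next_byte(&mut self) -> u8 {
        let byte = self.data.get(self.pos).copied().unwrap_or(0);
        self.pos += 1;
        byte
    }

    /// Maps one input byte into `range` (slightly biased, fine for a sketch).
    pub fn int_in_range(&mut self, range: RangeInclusive<u32>) -> u32 {
        let (lo, hi) = (*range.start(), *range.end());
        assert!(lo <= hi, "empty range");
        let span = u64::from(hi - lo) + 1;
        lo + (u64::from(self.next_byte()) % span) as u32
    }

    /// Returns true in roughly `percent` out of 100 cases.
    pub fn percent_chance(&mut self, percent: u32) -> bool {
        u32::from(self.next_byte()) * 100 / 256 < percent
    }
}

/// Program-shape context that the injection range has to respect.
pub struct Context {
    pub max_instructions: u32,
    pub control_instructions: bool,
}

/// Picks how many times one syscall is injected; 0 means "not injected".
pub fn pick_injection_count(
    src: &mut ByteSource<'_>,
    ctx: &Context,
    include_percent: u32,
    full_range: RangeInclusive<u32>,
) -> u32 {
    if !src.percent_chance(include_percent) {
        return 0;
    }
    // Short or straight-line programs execute most injected calls, so a large
    // range would exhaust gas; fall back to a small assumed range instead.
    let range = if !ctx.control_instructions || ctx.max_instructions < 100 {
        1..=3
    } else {
        full_range
    };
    src.int_in_range(range)
}
```

For gr_send this would be called as `pick_injection_count(&mut src, &ctx, 90, 10..=15)`, with a different inclusion percentage and range for each syscall or group of syscalls.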