WebAssembly / wabt

The WebAssembly Binary Toolkit
Apache License 2.0
6.91k stars 702 forks source link

wasm-decompile: add function index comments #2482

Closed python273 closed 1 month ago

python273 commented 1 month ago

You still have to cross-reference with other tools, all of which show function indexes: Chromium debugger, .wat, Cetus. So it's easier to search for functions in text editors when there's a numeric index.

sbc100 commented 1 month ago

You still have to cross-reference with other tools, all of which show function indexes: Chromium debugger, .wat, Cetus. So it's easier to search for functions in text editors when there's a numeric index.

But doesn't the chromium debugger and the wat format both use function names (from the name section)?

Also, I assume in the case when there is no name in the name section the decompiler will use the function index as the function name?

python273 commented 1 month ago

I'm new to WASM, so not sure in which cases names are available. Here's how most functions look in a random WASM game (Unity, Emscripten):

Chromium / Firefox devtools: (func $func704 (param $var0 i32) (param $var1 i32)

.wat (func (;704;) (type 4) (param i32 i32)

wasm-decompile: function f_caa(a:int, b:int) {

The names are only shown for imported/exported functions: (func $stackRestore (;415;) (export "stackRestore") (param $var0 i32) (func $env._glClientWaitSync (;411;) (import "env" "_glClientWaitSync") (param i32 i32 i32 i32) (result i32))

So in the end you need numerical indexes to navigate.

sbc100 commented 1 month ago

I'm new to WASM, so not sure in which cases names are available. Here's how most functions look in a random WASM game (Unity, Emscripten):

Chromium / Firefox devtools: (func $func704 (param $var0 i32) (param $var1 i32)

.wat (func (;704;) (type 4) (param i32 i32)

If you build you module with a name section I think you will see the actual names here.

wasm-decompile: function f_caa(a:int, b:int) {

Ah... where is caa coming from here? It seems like this should be f_703 in the absence of a better/real name. Perhaps we should fix that?

The names are only shown for imported/exported functions: (func $stackRestore (;415;) (export "stackRestore") (param $var0 i32) (func $env._glClientWaitSync (;411;) (import "env" "_glClientWaitSync") (param i32 i32 i32 i32) (result i32))

So in the end you need numerical indexes to navigate.

import/exported names are just one way to give a function a name. The better way to to use the names section.

sbc100 commented 1 month ago

BTW, this change LGTM, with comments resolved.

python273 commented 1 month ago

If you build you module with a name section I think you will see the actual names here.

It's not my module and I don't have the source code :)

Ah... where is caa coming from here? It seems like this should be f_703 in the absence of a better/real name. Perhaps we should fix that?

So I tried adding a cmd arg for NameOpts::None (#2477), but it turns it off for other things like variables or globals, making the output way less readable, as it becomes a mess of numbers. Maybe adding an option to turn it off only for function names might be nice, though alphabetic names seem better than 5 digit numbers, so not sure, opted for comments.

sbc100 commented 1 month ago

If you build you module with a name section I think you will see the actual names here.

It's not my module and I don't have the source code :)

In those cases I would hope the decompiler would put the function index into the name, not only in a comment.

Ah... where is caa coming from here? It seems like this should be f_703 in the absence of a better/real name. Perhaps we should fix that?

So I tried adding a cmd arg for NameOpts::None (#2477), but it turns it off for other things like variables or globals, making the output way less readable, as it becomes a mess of numbers.

So is caa just a made up string? Is that somehow less messy than using a number?

Maybe adding an option to turn it off only for function names might be nice, though alphabetic names seem better than 5 digit numbers, so not sure, opted for comments.

python273 commented 1 month ago

So is caa just a made up string?

It's generated from an index, so 704 = caa, 25 = z, 26 = aa, 27 = ba:

https://github.com/WebAssembly/wabt/blob/b15a13ef84b5a12aadeb1200521a89a2c95dce19/include/wabt/generate-names.h#L33-L42

Is that somehow less messy than using a number?

Just slightly easier to read/remember f_ficb instead of f37419. But yes, it'd be a good option to add, my main problem was finding the functions having an index from debugger, so a comment works well.

sbc100 commented 1 month ago

It's generated from an index, so 704 = caa, 25 = z, 26 = aa, 27 = ba:

Interesting! That seems like pointless obfuscation to me... why not put the useful information in the name?