WebAssembly / tool-conventions

Conventions supporting interoperability between tools working with WebAssembly.
Artistic License 2.0

Coredump: add local/stack value typing #199

Closed: xtuc closed this 1 year ago

xtuc commented 1 year ago

This change adds typing to local and stack values. Technically the type isn't strictly necessary as DWARF should indicate how to interpret the values.

However, using types lets us represent a missing value (for example, one that has been optimized out) as well as more complex values such as reference types.

Closes https://github.com/WebAssembly/tool-conventions/issues/198
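As an illustration of the typing described above, here is a minimal sketch of a tagged value with a dedicated "missing" case. The numeric tags reuse the WebAssembly value-type bytes; the `TAG_MISSING` value, the fixed-width little-endian payloads, and all names (`Value`, `encode`, `decode`) are assumptions made for this sketch, not the encoding adopted by this change. Reference types are left out for brevity.

```rust
use std::convert::TryInto;

/// A local or stack value as it might be recorded in a coredump frame.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    /// No value is available, e.g. because it was optimized out.
    Missing,
    I32(i32),
    I64(i64),
    F32(f32),
    F64(f64),
}

// Hypothetical tag for a missing value (an assumption of this sketch).
const TAG_MISSING: u8 = 0x01;
// WebAssembly value-type bytes, reused here as type tags.
const TAG_I32: u8 = 0x7F;
const TAG_I64: u8 = 0x7E;
const TAG_F32: u8 = 0x7D;
const TAG_F64: u8 = 0x7C;

/// Encodes a value as a one-byte tag followed by a fixed-width
/// little-endian payload (a simplification chosen for this sketch).
pub fn encode(v: &Value) -> Vec<u8> {
    let mut out = Vec::new();
    match v {
        Value::Missing => out.push(TAG_MISSING),
        Value::I32(n) => { out.push(TAG_I32); out.extend_from_slice(&n.to_le_bytes()); }
        Value::I64(n) => { out.push(TAG_I64); out.extend_from_slice(&n.to_le_bytes()); }
        Value::F32(n) => { out.push(TAG_F32); out.extend_from_slice(&n.to_le_bytes()); }
        Value::F64(n) => { out.push(TAG_F64); out.extend_from_slice(&n.to_le_bytes()); }
    }
    out
}

/// Decodes one value and returns it with the number of bytes consumed.
/// Returns `None` on an unknown tag or a truncated payload.
pub fn decode(bytes: &[u8]) -> Option<(Value, usize)> {
    let (&tag, rest) = bytes.split_first()?;
    match tag {
        TAG_MISSING => Some((Value::Missing, 1)),
        TAG_I32 => Some((Value::I32(i32::from_le_bytes(rest.get(..4)?.try_into().ok()?)), 5)),
        TAG_I64 => Some((Value::I64(i64::from_le_bytes(rest.get(..8)?.try_into().ok()?)), 9)),
        TAG_F32 => Some((Value::F32(f32::from_le_bytes(rest.get(..4)?.try_into().ok()?)), 5)),
        TAG_F64 => Some((Value::F64(f64::from_le_bytes(rest.get(..8)?.try_into().ok()?)), 9)),
        _ => None,
    }
}
```

The point of the tag is that a reader can tell "optimized out" apart from a real zero without consulting DWARF.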

xtuc commented 1 year ago

> What is codeoffset relative to?

The start of the code section, as specified by https://yurydelendik.github.io/webassembly-dwarf/#pc. I have sometimes found it unreliable, so I'm using the funcidx as the codeoffset in my experiments.

> Right now there's a reserved:u32 at the end of a frame but at least elsewhere in the binary encoding for wasm this is typically done with a prefix byte rather than a postfix byte

Yeah, that's a good point. It's inconsistent and I'll fix it. Thanks.

alexcrichton commented 1 year ago

FWIW I do think that the actual offset within the function is helpful since, as you've linked, it can be used to translate to a filename/line number in a backtrace. If you're unable to get codeoffset working, though, perhaps the funcidx could always be encoded and the offset within-the-function (either specified as relative to the function's start or relative to the code section start) could be an optional field to only get filled in if recoverable?

xtuc commented 1 year ago

@alexcrichton yes, that's actually a good idea; storing both the funcidx and the code offset relative to the function's start, which can be 0 if unknown.

It has the added benefit of letting a backtrace degrade to func indices if you don't have the source module with debug info, which is not possible with code offsets alone.
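A rough sketch of that degradation, assuming a frame that stores both fields. The `Frame` layout, the `DebugInfo` map standing in for real DWARF lookup, and the `describe` helper are all hypothetical names invented for this illustration.

```rust
use std::collections::HashMap;

/// A frame carrying both fields discussed above (hypothetical layout).
pub struct Frame {
    pub funcidx: u32,
    /// Offset relative to the start of the function's code; 0 if unknown.
    pub codeoffset: u32,
}

/// Hypothetical stand-in for debug info: maps (funcidx, codeoffset)
/// to a "file:line" string. A real tool would query DWARF instead.
pub type DebugInfo = HashMap<(u32, u32), String>;

/// Renders one backtrace line. With debug info the line includes a source
/// location; without it (or without a match) it falls back to the bare
/// function index.
pub fn describe(frame: &Frame, debug: Option<&DebugInfo>) -> String {
    let loc = debug.and_then(|d| d.get(&(frame.funcidx, frame.codeoffset)));
    match loc {
        Some(loc) => format!("func[{}] at {}", frame.funcidx, loc),
        None => format!("func[{}]", frame.funcidx),
    }
}
```

With only a code offset in the frame, the fallback branch would have nothing meaningful to print.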

dschuff commented 1 year ago

Code offsets are pretty important for DWARF because DWARF usually needs to refer to instructions rather than functions per se (I actually can't think of any cases off the top of my head where DWARF needs to refer to a function abstractly rather than a particular piece of a function, but I could very well be wrong about that). For example, in backtraces, each location usually points at the exact call instruction rather than at the frame of each function generally.

If the intention is to have information general to the whole function or frame (and the exact path taken from one frame to the next isn't relevant) IMO a funcidx makes more sense.

dschuff commented 1 year ago

Oops, I raced with @xtuc. I guess the advantage of a code offset is that it's consistent with DWARF (which wants to pretend there's a unified address space and everything is just a pointer), and the advantage of a func + func offset is that you can degrade to having the funcoffset be 0 (in the former case you could also degrade to having a codeoffset pointing to the top of the function, but maybe the latter is easier).
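The two representations compared above can be converted into each other when the start offset of every function body is known. The sketch below assumes a sorted table `starts` of code-section offsets, one per defined function; that table and the helper names `split_offset` and `join_offset` are assumptions for illustration. The index is into the defined functions only, so a real funcidx would also add the number of imported functions.

```rust
/// `starts[i]` is the code-section offset at which the body of the i-th
/// defined function begins, sorted ascending (an assumed input table).
///
/// Splits a code-section offset into (defined-function index, offset
/// within that function). Offsets past the last start are attributed to
/// the last function, since this sketch does not track body lengths.
pub fn split_offset(starts: &[u32], code_offset: u32) -> Option<(usize, u32)> {
    // Count the bodies that start at or before `code_offset`.
    let n = starts.partition_point(|&s| s <= code_offset);
    if n == 0 {
        return None; // The offset lies before the first function body.
    }
    let i = n - 1;
    Some((i, code_offset - starts[i]))
}

/// Rebuilds a code-section offset from a function index and an offset
/// within the function. A `func_offset` of 0 yields the top of the
/// function, which is the degraded form mentioned above.
pub fn join_offset(starts: &[u32], index: usize, func_offset: u32) -> Option<u32> {
    starts.get(index).map(|s| s + func_offset)
}
```

For instance, with `starts = [10, 25, 60]`, a code offset of 30 splits into function 1 at offset 5, and joining function 2 with offset 0 points at 60, the top of that function.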

alexcrichton commented 1 year ago

Oh sorry, so to clarify, I don't mind if it's funcidx, code-section offset, or function offset plus function index. I found it odd that it's specified as codeoffset but is being used as a funcidx right now. I don't personally have a horse in this race so don't have a preference.

xtuc commented 1 year ago

got it @alexcrichton.

I like the idea of having both funcidx and codeoffset (relative to the start of the function's code), for the reasons mentioned above.

xtuc commented 1 year ago

@dschuff please merge the change. I'm planning to follow up on funcidx and/or codeoffset in a future change.