Coredump: switch format to Wasm module

xtuc commented 1 year ago

Reuse the Wasm module container to store a coredump. Debugging information is stored in custom sections, and the main memory is stored in the data section.

Note that the naming of the custom sections is still a work in progress, and that we can add more information to process/thread-info if and when needed.

xtuc commented 1 year ago

Thanks for the review @dschuff!

I agree about 1.

Partial dumps. Often memory dumps only include part of the process' memory space, but I don't see a way here to tell the difference between a full dump that just includes a lot of 0 values (which wouldn't be included in any segment) and a partial dump

We can specify multiple data segments with their corresponding offset in memory:

(data (i32.const 1) "...10 bytes")
(data (i32.const 100) "...100 bytes")

This also plays well with multiple memories. As opposed to ELF, Wasm modules don't include a memory mapping table that would help with partial dumps or with identifying the memory segments. Compilers could emit such a section that coredumps could rely on, but this is outside the scope of coredumps.
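To make the segment-per-region idea concrete, here is a minimal sketch that renders captured memory regions as data segments in the text format shown above. The function names and the `(offset, bytes)` region representation are assumptions for illustration only; they are not part of the coredump proposal or of any existing tool.

```python
def data_segment_wat(offset: int, payload: bytes) -> str:
    """Render one active data segment in the WebAssembly text format."""
    # Escape every byte as \hh so arbitrary binary content stays a valid WAT string.
    escaped = "".join(f"\\{b:02x}" for b in payload)
    return f'(data (i32.const {offset}) "{escaped}")'


def dump_regions_wat(regions: list[tuple[int, bytes]]) -> str:
    """Render captured regions, sorted by offset, one segment per line."""
    return "\n".join(data_segment_wat(off, data) for off, data in sorted(regions))


if __name__ == "__main__":
    # Two regions at the offsets used in the example above (contents are made up).
    print(dump_regions_wat([(100, b"\xff" * 4), (1, b"\x01\x02")]))
```

Each region becomes its own segment, so anything not covered by a segment is simply absent from the output.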

Maybe it's only me, but I don't see any reference to partial coredumps in the ELF spec. systemd-coredump mentions that coredumps can be truncated, which presumably removes some data segments.

I wonder if it makes sense to put the process info or some other custom section as the first section in the binary, to make it easier to identify coredumps

Yes, I agree. The only reason I haven't done this is that it would break my early tooling. I like that ELF coredumps can be identified by reading the first few bytes.
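As a rough sketch of what "identify by the first few bytes" could look like if a custom section came first: check the Wasm magic and version, then see whether the first section is a custom section (id 0) with an agreed name. The marker name `core` below is a placeholder assumption, since the section naming is still being worked out, and the helper names are hypothetical.

```python
from typing import Optional

WASM_MAGIC = b"\x00asm"
WASM_VERSION = b"\x01\x00\x00\x00"


def read_u32_leb128(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def first_custom_section_name(buf: bytes) -> Optional[str]:
    """Return the first section's name if it is a custom section, else None."""
    if buf[:4] != WASM_MAGIC or buf[4:8] != WASM_VERSION:
        return None
    try:
        if buf[8] != 0:  # section id 0 means "custom section"
            return None
        _size, pos = read_u32_leb128(buf, 9)      # section size in bytes
        name_len, pos = read_u32_leb128(buf, pos)  # length of the section name
        return buf[pos:pos + name_len].decode("utf-8")
    except (IndexError, UnicodeDecodeError):
        return None  # truncated or malformed header


def looks_like_coredump(buf: bytes, marker: str = "core") -> bool:
    # "core" is a placeholder marker name, not something this thread settles on.
    return first_custom_section_name(buf) == marker
```

A tool would only need to read the header plus a short name to decide, without parsing the rest of the file.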

dschuff commented 1 year ago

What I meant about data segments is: In a regular wasm file the memory initialization is the combination of the specified memory size (which implicitly initializes everything to 0) plus the set of data segments that describe just the parts of memory that have nonzero contents. If a dump is a full dump of all memory, it can also be encoded this way. But there's no way to tell the difference between a full dump that is known to contain 0s in part of its memory (which would have no data segment covering that part) and a dump that only includes some parts (where other parts have unknown contents). Maybe that doesn't really matter, and we don't really consider it a problem? Not sure.

fitzgen commented 1 year ago

FWIW, Wizer's memory snapshots are literally Wasm files with data segments for the nonzero ranges (although we have to be careful not to run into implementation limits for the number of data segments, so we merge nearby data segments together when we get close to the limit).
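The nonzero-range encoding described here can be sketched as follows. This is an illustration only and not Wizer's actual algorithm: it merges ranges using a fixed `max_gap` threshold (an assumed parameter) rather than reacting to a segment-count limit, and all function names are hypothetical.

```python
def nonzero_ranges(memory: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) ranges covering every nonzero byte."""
    ranges = []
    start = None
    for i, b in enumerate(memory):
        if b != 0 and start is None:
            start = i                   # a nonzero run begins
        elif b == 0 and start is not None:
            ranges.append((start, i))   # the run ends just before this zero
            start = None
    if start is not None:
        ranges.append((start, len(memory)))
    return ranges


def merge_nearby(ranges: list[tuple[int, int]], max_gap: int) -> list[tuple[int, int]]:
    """Merge sorted ranges separated by at most max_gap zero bytes."""
    merged: list[tuple[int, int]] = []
    for start, end in ranges:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)  # absorb the gap into one range
        else:
            merged.append((start, end))
    return merged


def to_segments(memory: bytes, max_gap: int = 0) -> list[tuple[int, bytes]]:
    """Turn a memory image into (offset, payload) data segments."""
    return [(s, memory[s:e]) for s, e in merge_nearby(nonzero_ranges(memory), max_gap)]
```

Merging trades a few explicitly stored zero bytes for fewer segments, which is the trade-off being described.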

(Aside: I'm interested in this proposal! But I haven't had time to dig in yet, unfortunately. Sorry about that!)

xtuc commented 1 year ago

@dschuff got it now. Coredumps aren't instantiated like regular Wasm files (they use the Wasm binary encoding for generation/decoding convenience). I'll clarify that in the Coredump spec.

@fitzgen

(Aside: I'm interested in this proposal! But I haven't had time to dig in yet, unfortunately. Sorry about that!)

Glad to hear, I'm happy to have a video chat if that helps.

xtuc commented 1 year ago

@dschuff could you please merge the PR, or do you have more questions about memory segments? I also added a global section to the Coredump.

I'm planning to reach out to potential implementers to get more feedback / input.

dschuff commented 1 year ago

Sorry, I didn't mean to hold this up. I think this is fine, I'll merge it.

The way it's currently written suggests that any dump using multiple segments is partial (i.e. incomplete), but that situation is still indistinguishable from a dump that is known to be complete, but is encoded with multiple segments (so that zeros don't need to be written into the image). Practically speaking it may not matter, since a debugging tool might not do anything different. But if e.g. it finds a pointer that points to a missing portion, it might be good to know whether the pointed-to data is expected to be zero or is just missing from the dump.
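The ambiguity can be illustrated with a small sketch. It assumes a hypothetical `complete` flag supplied alongside the segments; nothing in the format as discussed here carries such a flag, which is exactly why the last two outcomes cannot currently be told apart. The names `Lookup` and `classify` are made up for this example.

```python
from enum import Enum


class Lookup(Enum):
    PRESENT = "present"  # the address is covered by a data segment
    ZERO = "zero"        # not covered, and the dump is known to be complete
    MISSING = "missing"  # not covered, and the dump may be partial


def classify(addr: int, segments: list[tuple[int, bytes]], complete: bool) -> Lookup:
    """Classify an address against (offset, payload) data segments.

    `complete` is a hypothetical out-of-band flag, not part of the format.
    """
    for offset, data in segments:
        if offset <= addr < offset + len(data):
            return Lookup.PRESENT
    return Lookup.ZERO if complete else Lookup.MISSING


if __name__ == "__main__":
    segments = [(1, b"a" * 10), (100, b"b" * 100)]
    # A pointer to address 50 falls between the two segments.
    print(classify(50, segments, complete=True))
    print(classify(50, segments, complete=False))
```

Without some equivalent of that flag, a debugger following a pointer into an uncovered range has to guess which case applies.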

WebAssembly / tool-conventions

Coredump: switch format to Wasm module #197