gimli-rs / gimli

A library for reading and writing the DWARF debugging format
https://docs.rs/gimli/
Apache License 2.0
831 stars 105 forks source link

Allow invalid ranges in RngListIter #715

Closed jameysharp closed 3 months ago

jameysharp commented 3 months ago

I thought it would be easiest to describe the issue I'm encountering with a patch, but I'm not attached to this particular solution if folks would prefer some other approach.

I've encountered a toolchain which produces invalid ranges, where the end of the range is before the beginning.

I want to fix that toolchain, of course, but I also noticed that tools like llvm-addr2line and llvm-dwarfdump --lookup apparently silently skip the invalid ranges, so they're able to give partial answers. By contrast, gimli returns an error in this case. Then the addr2line crate, for example, bubbles the error all the way out and doesn't return any results.

It would be useful to be able to skip over the invalid ranges, instead of truncating a range-list at the first error. To that end I've deleted the check here in RngListIter::convert_raw, so the caller can decide what they want to do with such ranges. addr2line, for example, already checks that range.begin < range.end and silently ignores the range otherwise.

The specific toolchain where I've encountered this is TinyGo (which uses LLVM), when building WebAssembly with the wasip1 target. I tested TinyGo 0.32.0-dev and LLVM 17, and this was the shortest program I could find which produces invalid ranges:

package main
import "os"
func main() {
        os.Lstat("some-filename")
}
philipc commented 3 months ago

This seems fine to me, but I'll take the time to reproduce it and see if any other options are apparent.

philipc commented 3 months ago

I couldn't reproduce this with the lastest dev version of tinygo (tinygo version 0.32.0-dev-ad0340d linux/amd64 (using go version go1.22.3 and LLVM version 17.0.1)).

Can you give me a file (email or attach here is fine).

jameysharp commented 3 months ago

Here's the wasm I got from tinygo build -target=wasip1 /tmp/test.go, with tinygo version 0.32.0-dev linux/amd64 (using go version go1.21.9 and LLVM version 17.0.6), same tinygo commit ad0340d4 as you tested.

test.wasm.gz

If you run cargo run -p gimli-examples --bin dwarfdump -- test.wasm you should see several messages saying "Failed to dump attribute value: The end of an address range must not be before the beginning." In the dwarfdump example program this doesn't stop execution, unlike what the addr2line crate does, but it does truncate the range list prematurely. This PR changes the dwarfdump output on this test wasm this way:

18409c18409,18410
<                       [ 2] Failed to dump attribute value: The end of an address range must not be before the beginning.
---
>                       [ 2] <offset-pair 0x00003e9b, 0x00003e7c> [0x00003e9b, 0x00003e7c]
>                       [ 3] <offset-pair 0x00003f17, 0x000040f1> [0x00003f17, 0x000040f1]
18514c18515,18516
<                       [ 1] Failed to dump attribute value: The end of an address range must not be before the beginning.
---
>                       [ 1] <offset-pair 0x00002b15, 0x00002b11> [0x00002b15, 0x00002b11]
>                       [ 2] <offset-pair 0x000041fa, 0x00004216> [0x000041fa, 0x00004216]
28023c28025,28032
<                       [ 0] Failed to dump attribute value: The end of an address range must not be before the beginning.
---
>                       [ 0] <offset-pair 0x0000b2bd, 0x0000b2b8> [0x0000b2bd, 0x0000b2b8]
>                       [ 1] <offset-pair 0x0000b307, 0x0000b304> [0x0000b307, 0x0000b304]
>                       [ 2] <offset-pair 0x00000000, 0x00000001> [0x00000000, 0x00000001]
>                       [ 3] <offset-pair 0x0000b3e4, 0x0000b3df> [0x0000b3e4, 0x0000b3df]
>                       [ 4] <offset-pair 0x0000b44a, 0x0000b445> [0x0000b44a, 0x0000b445]
>                       [ 5] <offset-pair 0x0000b4b0, 0x0000b4ab> [0x0000b4b0, 0x0000b4ab]
>                       [ 6] <offset-pair 0x0000b51e, 0x0000b519> [0x0000b51e, 0x0000b519]
>                       [ 7] <offset-pair 0x0000b56a, 0x0000b571> [0x0000b56a, 0x0000b571]
35919c35928
<                       [ 1] Failed to dump attribute value: The end of an address range must not be before the beginning.
---
>                       [ 1] <offset-pair 0x0000d29c, 0x0000d283> [0x0000d29c, 0x0000d283]
jameysharp commented 3 months ago

In case you're curious: After more investigation we believe these invalid ranges are being introduced by wasm-opt when tinygo runs the "asyncify" pass over LLVM's output, because we no longer see these invalid ranges if we run tinygo with the -scheduler=none flag, which disables asyncify.

There was a similar issue reported at WebAssembly/binaryen#6406, and actually we can reproduce that issue even with -scheduler=none, since tinygo still runs other wasm-opt passes. However, gimli doesn't try to verify that inlined address ranges fall within the address ranges of their parents, so it doesn't complain about that issue.

At any rate, while the producer of this DWARF is clearly buggy, I still think it's useful for gimli to return the address range as provided and let the caller decide if they want to validate it, so I'm still interested in merging this PR or something equivalent.

jameysharp commented 3 months ago

Sure, done!