Closed myzhan closed 11 months ago
If there are no debug symbols, it will not work. Normally, you pass -g to the compiler, but I'm not sure how go does it?
The other issue seems to be with xcode. Perhaps it would be better to build it with make or ninja, i.e.,
cmake -G make -DOPENSS...
Does that work?
For the building issue, I switch to make and it's ok. And I think the version of dsymutil does not support go executable, which I installed via brew. Do we need a new go-parser for go executables?
Good, I'll update the build instructions for OSX.
If you comment out the "system("dsymutil...");" stuff in macho-parser.cc, does it work with Go then?
No, it doesn't. Can I reopen the previous issue?
OK, I manage to make it work with Go, and the new engine is as fast as ptrace in linux.
I wrote a tool in Go to help me test the parsing procedure, and found that is_end_seq will make us missing the last address of each compile unit.
package main
import (
"debug/dwarf"
"debug/macho"
"fmt"
"io"
)
func panicIfErr(err error) {
if err != nil {
panic(err)
}
}
func main() {
executable := "/Users/zhanqp/tmp/hello"
exe, err := macho.Open(executable)
panicIfErr(err)
dw, err := exe.DWARF()
panicIfErr(err)
reader := dw.Reader()
for index := 0; ; index++ {
entry, err := reader.Next()
panicIfErr(err)
if entry == nil {
break
}
if entry.Tag != dwarf.TagCompileUnit {
continue
}
lrd, err := dw.LineReader(entry)
panicIfErr(err)
for {
var e dwarf.LineEntry
err := lrd.Next(&e)
if err == io.EOF {
break
}
panicIfErr(err)
fmt.Printf("%s:%d %x\n", e.File.Name, e.Line, e.Address)
}
}
}
Good work!
is_end_seq actually needed for C code, since clang leaves some data for use in functions at the end of the function. Without is_end_seq, kcov sees that as a line of code (since it looks like that in the DWARF info), and tries to set a breakpoint there, thereby corrupting the data.
What is the filetype for the Go binaries? If one can identify them some way, it would be fairly easy to skip is_enq_seq and dsymutil etc for that case.
BTW, I reverted my changes and test a simple C binary, it failed to parse debug symbols.
$ brew install dwarfutils
$ cat hello.c
#include <stdio.h>
int main() {
printf("hello\n");
return 0;
}
$ clang --version
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: x86_64-apple-darwin22.1.0
Thread model: posix
$ clang -g -O0 hello.c -o hc
$ dsymutil ./hc
warning: (x86_64) /var/folders/d4/8t_88ltn76l44lv65wnx3q9c0000gp/T/hello-19fac8.o unable to open object file: No such file or directory
warning: no debug symbols in executable (-arch x86_64)
What is the filetype for the Go binaries?
The linker of Go stores build info in executable, can will search for this magic string?
If I remember correctly, clang uses a special case when building in one step, i.e., without compile + link. Then I think it might store the debug symbols in the binary directly, otherwise dsymutils will gather everything from the object files.
Perhaps that's how Go does it as well?
Then I think there should be some special-case to catch this. Not quite sure how to determine it though.
Can I look at your stashed changes? I tried to follow your steps (with a hello world compiled directly to an executable), and I just get a segfault when parsing it.
Thanks! I did basically the same changes yesterday, but still with a crash during parsing. However, it looks to me that the hello world binary still doesn't have the DWARF info embedded, so maybe there is still some other way used compared to how Go does it.
So basically your hello world example, compiled in the same way. Is that parseable with kcov, with your commit?
By default, go will compress dwarf sections, you need to disable it when building the executable. Please refer to my previous issue for the command.
Simon Kagstrom @.***>于2023年8月25日 周五00:29写道:
Thanks! I did basically the same changes yesterday, but still with a crash during parsing. However, it looks to me that the hello world binary still doesn't have the DWARF info embedded, so maybe there is still some other way used compared to how Go does it.
So basically your hello world example, compiled in the same way. Is that parseable with kcov, with your commit?
— Reply to this email directly, view it on GitHub https://github.com/SimonKagstrom/kcov/issues/405#issuecomment-1692030762, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHI3NIAS2QP7QQ6YFG5QMDXW56PLANCNFSM6AAAAAA3ZW5XKI . You are receiving this because you authored the thread.Message ID: @.***>
Yes, but I meant your hello world in c
Ah, yes, I get the same segmentation fault when parsing c executable.
The crash is in dwarf_errmsg(). The error is DW_DLV_NO_ENTRY.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
* frame #0: 0x00000001002b927f libdwarf.0.dylib`dwarf_errmsg + 9
frame #1: 0x0000000100047a31 kcov`(anonymous namespace)::MachoParser::parse(this=0x00000001000c5278) at macho-parser.cc:135:13 [opt]
frame #2: 0x00000001000063ff kcov`Collector::run(this=0x00000001004063d0, filename=Summary Unavailable) at collector.cc:46:16 [opt]
frame #3: 0x000000010002db0a kcov`runKcov(runningMode=MODE_COLLECT_AND_REPORT) at main.cc:353:19 [opt]
frame #4: 0x000000010002ae60 kcov`main(argc=3, argv=<unavailable>) at main.cc:611:9 [opt]
frame #5: 0x00007ff812573310 dyld`start + 2432
After some digging, I realize that clang stores dwarf info into .dSYM instread of executable, that's why we can't directly parse the executable generated by clang. Apparently, the Go team doesn't follow the design.
See also:
Yes, I saw the same backtrace myself.
The dSYM stuff is the reason for the dsymutil invocation. The go scheme is more like what e.g., Linux etc do it with ELF, but I'm not sure how the trivial case with the compile-direct-to-executable with clang actually works. lldb is able to debug it, so there is some way, but I'm not sure what...
So far, I can use https://pkg.go.dev/golang.org/x/tools@v0.14.0/cmd/splitdwarf to generate dSYM from go executeble before running kcov. I think the macho parser can check the dSYM is valid, before running dsymutil, nor it will report that "no debug sysbols in executable" and overwrite previous file generate by splitdawrf.
And Go has a proposal to generate dSYM directly by linker, https://github.com/golang/go/issues/62577
BTW, I still need to comment out "m_imageBase" when running go executable.
OK, thanks for the progress!
I haven't looked at the mach-o stuff for a while, so I'm not don't know how to check the validity of the dSYM info. I guess there should be some checksum to match against the binary, but I'm not sure.
I guess we can check for the buildinfo string to identify go binaries, and then special case them. Alternatively, perhaps it's easier to just add a command-line option which does it for now?
E.g., with --parse-go-binary (or finding it directly) it would
I guess that would work both for the splitdwarf variant and the produced-by-linker case?
I prefer to find it directly by search the binary for the buildinfo magic string, which is "\xff Go buildinf:". If you are ok with that, I can submit a PR later.
Yes, absolutely! Must the entire binary be parsed, or is it placed in a known place?
Only for the macho engine, I will try to use the dwarf parser to locate the "__go_buildinfo" section, and search for the magic string in it, just like the Go std.
I have submitted a PR. And I can't reproduce the problem of is_seq_end, so I keep it.
After manually signing the binary, It failed to collect coverage from a go binary.
$ ./src/Release/kcov $HOME/tmp/report $HOME/tmp/hello Picked parser: Mach-O warning: no debug symbols in executable (-arch x86_64)