golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.63k stars 17.61k forks source link

cmd/objdump: incorrect filename for function #32068

Open josharian opened 5 years ago

josharian commented 5 years ago
$ go tool objdump -s cmplx.Inf ../pkg/darwin_amd64/math/cmplx.a
TEXT math/cmplx.Inf(SB) gofile..$GOROOT/src/math/bits.go
  bits.go:27        0x2e24          90          NOPL            
  isinf.go:19       0x2e25          f20f100500000000    MOVSD_XMM 0(IP), X0 [4:8]R_PCREL:$f64.7ff0000000000000  
  isinf.go:20       0x2e2d          f20f11442408        MOVSD_XMM X0, 0x8(SP)   
  isinf.go:20       0x2e33          f20f11442410        MOVSD_XMM X0, 0x10(SP)  
  isinf.go:20       0x2e39          c3          RET         

Note that cmplx.Inf is listed as being in $GOROOT/src/math/bits.go. It looks like cmd/objdump may assume that the first instruction in the function is located within the function. With inlmarks in a prologue-less function, at least, this assumption is false.

Also, why is that inlining mark there in the first place?

cc @randall77 because this is an unintended consequence of inlmarks cc @dr2chase for all things position

randall77 commented 5 years ago

Before genssa, we have:

v8 (+21) = InlMark <void> [1] v1
v19 (+17) = InlMark <void> [2] v1
v14 (22) = MOVSDconst <float64> [+Inf] : X0

After genssa, we have:

v8 00006 (?) NOP
v19 00007 (+17) XCHGL   AX, AX
v14 00008 (+21) MOVSD   $(+Inf.0), X0

Yes, there's something odd about the inline mark removal which might need fixing. Somehow instead of removing both marks, it merged one into a real instruction (v8->v14) and maybe that forced it to keep the other one (v19)?

The main bug I'm not surprised about. There's no other place to get position information other than the first instruction. Except maybe the return instruction (if there is one)?

josharian commented 5 years ago

Another place to look for position information, at least on amd64, is any trailing (padding) INT $3 instructions.

But changing the heuristic to look first for a RET and then at the first instruction if no RET seems like a good place to start.

joshuastrauss commented 5 years ago

Hello, I'm Joshua and still very new to go.

I was able to get this working in about 8 lines of code. Here is the process I took.

disasm.go Inside of the Print method, which starts on line 186:

  1. Created an additional instance of the tabWriter to handle the top-level line of the output ~line 202.
  2. Removed the call to print the first line as it is now handled later in the function ~line 214.
  3. Added two additional variables to store the last instruction that is a part of the disassembly as well as the file associated with the instruction ~line 226
  4. Inside the callback function passed into the Decode method, set the variables that were declared as a part of item 3 ~line 233.
  5. Directly after the call to Decode, I added some logic to check the value of the last instruction and if “RET”, use the file associated with it or else default to the first instruction and then flush the header and body of the output ~line 260

I used an additional instance of tabWriter due to the calls to Fprintf inside of the callback that is passed into Decode. The enclosing Print method depends on the io.Writer interface, so the changes are more involved than I wanted to get into without first reaching out. Please let me know if the above makes sense or the direction I should be taking for the fix.

josharian commented 5 years ago

Thanks for looking into this!

What is the performance hit if you simply call Decode twice, first just to find the RET (if any), then a second time to do all the printing? (By the way, any RET anywhere in the function should work. It doesn't have to be the final instruction, and it often won't be.)

joshuastrauss commented 5 years ago

This is a better suggestion. It drops the amount of changed lines and preserves the order to the output to match the order of the logic in the code. Thanks for pointing out the RET code could be anywhere in the instructions, I was dialed too much into the example and didn’t think about that.

The one caveat is that I’m not sure how to short-circuit the call to Decode because the function is executing a callback which prevents adding a goto label due to scope?

The execution time is linear, so I don’t think decoding the symbols twice will add that much overhead unless there’s something happening at a lower level that I am not aware of. I’m going to add some profiling code and run a few performance tests on the algorithm as it will be a good learning experience.

josharian commented 5 years ago

Sounds good. Let me know if I can be of help.

As to short circuiting, I think you’ll have to add explicit support for it, e.g. by having the function arg to Decode return a bool indicating whether to stop decoding. You could start without that and add it in as a second step, since it is an optimization.

joshuastrauss commented 5 years ago

I was as able to familiarize myself with pprof this morning.

There is a 2x slow down when invoking the function twice to get the ret code. Adding the short circuit logic and re-running the profile on runtime.a cuts the additional overhead in half, which is probably close to average case. I have a fast laptop, but the results were measurable in seconds, which is enough justification to include the short circuit logic since it’s already written down. I’ll submit these changes and leave it up to code review going forward.

gopherbot commented 5 years ago

Change https://golang.org/cl/182758 mentions this issue: cmd: Correctly report the source file of an assembly

joshuastrauss commented 5 years ago

Thanks for all the help Josh