llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[llvm-exegesis] DF flag not being restored #116127

Open boomanaiden154 opened 2 weeks ago

boomanaiden154 commented 2 weeks ago

From the SystemV X86 ABI specification:

> The direction flag DF in the %rFLAGS register must be clear (set to “forward” direction) on function entry and return.

DF needs to be cleared whenever we touch rflags at all. Just clearing rflags upon function exit seems like a reasonable enough option as we (for now) do not care about the flag values after the snippet has finished executing.
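To illustrate why the ABI requirement matters, here is a minimal Python model of what a string instruction such as `rep movsb` does under each DF setting. The helper names (`rep_movsb`, `copy_forward`) and the buffer layout are made up for this sketch; it is not code from llvm-exegesis.

```python
def rep_movsb(memory, src, dst, count, df):
    """Model `rep movsb`: copy `count` bytes one at a time.

    Both index registers step by +1 when DF is clear and by -1 when DF is set.
    """
    step = -1 if df else 1
    for _ in range(count):
        memory[dst] = memory[src]
        src += step
        dst += step
    return src, dst


def copy_forward(df):
    # Hypothetical callee that assumes DF is clear on entry, the way a
    # memcpy built on `rep movsb` would.
    memory = bytearray(b"....ABCD........")
    rep_movsb(memory, src=4, dst=10, count=4, df=df)
    return bytes(memory)


print(copy_forward(df=False))  # intended forward copy of "ABCD"
print(copy_forward(df=True))   # same call after a snippet left DF set
```

With DF left set, the same call walks backward from its start addresses, so it copies the wrong bytes and overwrites part of the source. That is the kind of breakage a caller can hit if a snippet returns without clearing DF.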

llvmbot commented 2 weeks ago

@llvm/issue-subscribers-tools-llvm-exegesis

Author: Aiden Grossman (boomanaiden154)

topperc commented 2 weeks ago

> Just clearing rflags upon function exit seems like a reasonable enough option as we (for now) do not care about the flag values after the snippet has finished executing.

There's no direct move to rflags. Are you going to push 0 to the stack and pop it?

boomanaiden154 commented 2 weeks ago

> There's no direct move to rflags. Are you going to push 0 to the stack and pop it?

That's what I was thinking of when writing the issue. In subprocess mode we have no guarantee that %rsp actually points to mapped memory though, so that probably won't work. Given that DF is the only flag the SystemV X86 ABI constrains, just clearing DF before returning whenever we set EFLAGS/DF should be fine.

legrosbuffle commented 1 week ago

> > Are you going to push 0 to the stack and pop it?
>
> That's what I was thinking of when writing the issue.

It's probably easier to just push and pop the proper register value? Popping 0 would still not abide by the calling convention.

boomanaiden154 commented 1 week ago

> It's probably easier to just push and pop the proper register value? Popping 0 would still not abide by the calling convention.

We don't need to use the stack at all. We should be able to just insert a cld instruction where necessary (or just unconditionally include it).

legrosbuffle commented 1 week ago

> We don't need to use the stack at all. We should be able to just insert a cld instruction where necessary

Ideally we'd leverage prologepilog as much as possible, without having to resort to custom logic to preserve the registers.

> In subprocess mode we have no guarantee that %rsp actually points to mapped memory though, so that probably won't work.

But note that if RSP is not set to a valid value, we currently can't set EFLAGS either, because this is the code we generate to set it:

   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   c7 04 24 00 00 00 00    movl   $0x0,(%rsp)
   b:   c7 44 24 04 00 00 00    movl   $0x0,0x4(%rsp)
  12:   00 
  13:   9d                      popf
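
The dependency on a valid stack pointer can be shown with a small Python model of the sequence above. This is only a sketch: `set_flags_via_stack`, `UnmappedAccess`, and the page-set representation of mapped memory are hypothetical, and the model ignores details of real hardware such as reserved flag bits.

```python
class UnmappedAccess(Exception):
    """Raised by the model when a store hits an unmapped page."""


def set_flags_via_stack(rsp, mapped_pages, page_size=4096):
    """Model: sub $0x8,%rsp; movl $0x0,(%rsp); movl $0x0,0x4(%rsp); popf."""
    memory = {}
    rsp -= 8                              # sub $0x8,%rsp
    for offset in (0, 4):                 # two 32-bit stores of zero
        addr = rsp + offset
        if addr // page_size not in mapped_pages:
            raise UnmappedAccess(hex(addr))
        memory[addr] = 0
    # popf loads the 8-byte slot into the flags register...
    flags = memory[rsp] | (memory[rsp + 4] << 32)
    rsp += 8                              # ...and releases the slot
    return flags, rsp


# Works when the page just below the stack pointer is mapped.
print(set_flags_via_stack(0x70001000, mapped_pages={0x70000}))

# Faults in the model when nothing is mapped there.
try:
    set_flags_via_stack(0x1000, mapped_pages=set())
except UnmappedAccess as err:
    print("store to unmapped address", err)
```

In the model, as in the generated code, both stores happen before `popf`, so an unmapped stack pointer fails before any flag is written.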
legrosbuffle commented 1 week ago

> We should be able to just insert a cld instruction where necessary (or just unconditionally include it).

OK, I misinterpreted the calling convention. Only DF needs to be cleared on exit, so unconditionally including cld sounds like a good option.

> The direction flag DF in the %rFLAGS register must be clear (set to “forward” direction) on function entry and return. Other user flags have no specified role in the standard calling sequence and are not preserved across calls.
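
As a rough illustration of that reading of the ABI, the sketch below treats RFLAGS as an integer and checks only DF (bit 10) on return. The function names are made up for the example; `cld` and `std` here only model what the real instructions do to that one bit.

```python
DF_MASK = 1 << 10  # DF is bit 10 of RFLAGS


def cld(rflags):
    """Model `cld`: clear DF, leave every other flag untouched."""
    return rflags & ~DF_MASK


def std(rflags):
    """Model `std`: set DF, leave every other flag untouched."""
    return rflags | DF_MASK


def abi_ok_on_return(rflags):
    # Hypothetical check: only DF has a specified role at function return,
    # so the other user flags are deliberately ignored.
    return (rflags & DF_MASK) == 0


flags_after_snippet = 0x4C5              # CF, PF, ZF, SF and DF all set
print(abi_ok_on_return(flags_after_snippet))        # DF set: violates the ABI
print(abi_ok_on_return(cld(flags_after_snippet)))   # fine after an unconditional cld
```

An unconditional `cld` is safe under this reading because it leaves the other flags alone, and those carry no guarantee across calls anyway.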

boomanaiden154 commented 1 week ago

> Ideally we'd leverage prologepilog as much as possible, without having to resort to custom logic to preserve the registers.

Definitely. I need to look into whether or not it makes sense to extend prologepilog to look at df. It seems like something it should handle. Not entirely sure why it doesn't. Maybe it just hasn't come up much in the past?

> Ideally we'd leverage prologepilog as much as possible, without having to resort to custom logic to preserve the registers.

Yeah, doing this within exegesis seems to me like a reasonable stopgap before a more thorough investigation of the surrounding infrastructure on my end.