Open Quuxplusone opened 7 years ago
Bugzilla Link | PR33913 |
Status | NEW |
Importance | P enhancement |
Reported by | Manoj Gupta (manojgupta@google.com) |
Reported on | 2017-07-24 11:59:47 -0700 |
Last modified on | 2017-09-04 03:23:47 -0700 |
Version | trunk |
Hardware | PC Linux |
CC | coby.tayree@intel.com, dvyukov@google.com, efriedma@quicinc.com, glider@google.com, llvm-bugs@lists.llvm.org, rengolin@gmail.com, rnk@google.com, zvirack@gmail.com |
Consider the following:
void a() {
register int a asm("sp") = 0;
asm volatile("nop":"+r"(a));
}
In this case, both gcc and clang zero out "sp".
If you don't initialize the variable, you're basically asking the compiler to
put uninitialized data into rsp. If you're lucky, the compiler realizes that
putting uninitialized data into rsp is a no-op, and therefore does nothing...
but if you're unlucky, the compiler shoves some other unrelated value into rsp,
and it explodes (which is what is happening here).
I think the right approach here is to propose some well-defined mechanism for
getting the result you want... and then maybe add a hack to clang to map this
particular construct to the same mechanism.
According to what Renato wrote here:
https://lists.linuxfoundation.org/pipermail/llvmlinux/2014-May/000946.html, GCC
doesn't seem to always handle local register variables correctly either (I've
just checked this is also true for x86_64), e.g. it may drop a store to such a
variable.
The easiest way to fix the crashes is to move __sp to the global scope:
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index a969ae6..6adc0a7 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -174,11 +174,13 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
* Clang/LLVM cares about the size of the register, but still wants
* the base register for something that ends up being a pair.
*/
+
+register unsigned long int __sp asm(_ASM_SP);
+
#define get_user(x, ptr) \
({ \
int __ret_gu; \
register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \
- register void *__sp asm(_ASM_SP); \
__chk_user_ptr(ptr); \
might_fault(); \
asm volatile("call __get_user_%P4" \
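For readers without a kernel tree at hand, the idea in the diff above can be sketched in a standalone file: a file-scope register variable bound to the stack pointer, named as an in/out operand of an otherwise empty asm statement. This is only a sketch under stated assumptions. The names `SKETCH_SP_REG`, `sketch_sp` and `sketch_touch_sp` are made up for illustration and are not kernel identifiers, the register names cover x86 only, and nothing here shows how either compiler treats the construct inside `get_user()`.

```c
#include <stdint.h>

/* Illustrative only: pick the stack pointer's register name on x86.
 * SKETCH_SP_REG stands in for the kernel's _ASM_SP and is an assumption. */
#if defined(__x86_64__)
#define SKETCH_SP_REG "rsp"
#elif defined(__i386__)
#define SKETCH_SP_REG "esp"
#endif

#ifdef SKETCH_SP_REG
/* File-scope (global) register variable bound to the stack pointer,
 * analogous to moving __sp out of the macro body. */
register uintptr_t sketch_sp __asm__(SKETCH_SP_REG);

uintptr_t sketch_touch_sp(void)
{
    /* Empty asm that names the stack pointer as an in/out operand. */
    __asm__ volatile("" : "+r"(sketch_sp));
    return sketch_sp;
}
#else
/* Fallback for other targets: approximate the stack pointer with the
 * address of a local, so the sketch still compiles and runs. */
uintptr_t sketch_touch_sp(void)
{
    int local = 0;
    return (uintptr_t)&local;
}
#endif
```

Calling `sketch_touch_sp()` from any function should return a non-zero address; comparing the disassembly of such a caller under both compilers is one way to look for extra moves into the stack pointer.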
Actually, the linked thread (https://lkml.org/lkml/2017/7/12/555) contains a deeper analysis by Josh Poimboeuf, who also notes that simply making __sp a global variable leads to a kernel .text size regression under GCC.
My reading from that thread is that clang and gcc treat the __sp variable differently, and each approach has its own benefits and problems. Since this is undefined and largely undocumented behaviour, I find it hard to believe either side will be convinced to change.
However, there is one hint in that thread that may bring the final solution. Just add SP directly to the clobber list. It should work on both compilers and have the intended effect without additional movs.
> My reading from that thread is that both clang and gcc treat the __sp variable
> different and each has its own benefits/problems. Since this is undefined and
> largely undocumented behaviour, I find it hard to believe either side will be
> convinced to change.
Agreed.
> However, there is one hint in that thread that may bring the final solution.
> Just add SP directly to the clobber list. It should work on both compilers and
> have the intended effect without additional movs.
Quoting https://lkml.org/lkml/2017/7/19/1144:
"""
> > IIRC, clobbering SP does at least force the stack frame on GCC, though I
> > need to double check that. I can try to work up an official patch in
> > the next week or so (need to do some testing first).
>
> Sounds great.
>
> Thanks again for looking into this and coming up with a solution!
After doing some testing, I don't think this approach is going to work
after all. In addition to forcing the stack frame, it also causes GCC
to add an unnecessary extra instruction to the epilogue of each affected
function:
lea -0x10(%rbp),%rsp
"""
, so a patch that clobbers SP is unlikely to be accepted upstream (although it
makes the Clang build work :))
Josh is currently working on a more intrusive kernel patch that's likely to
solve the problem:
https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=ASM_CALL
(In reply to Alexander Potapenko from comment #6)
> After doing some testing, I don't think this approach is going to work
> after all. In addition to forcing the stack frame, it also causes GCC
> to add an unnecessary extra instruction to the epilogue of each affected
> function:
Right, that's not good either. :(
> Josh is currently working on a more intrusive kernel patch that's likely to
> solve the problem:
> https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=ASM_CALL
Looks hairy, but mostly mechanical.
Josh uploaded a new patch for this (https://lkml.org/lkml/2017/8/31/513), but
some questions have been raised about it, in particular by Linus Torvalds
(https://lkml.org/lkml/2017/8/31/627):
On Thu, Aug 31, 2017 at 09:11:54AM -0700, Linus Torvalds wrote:
On the whole, I'm not entirely sure this is the right approach. I think
we should
(a) approach clang about their obvious bug (a compiler that clobbers
%rsp because we mark it as in/out is clearly buggy)
(b) ask gcc people if there's some other alternative that would work
with clang as-is rather than the "mark %rsp register as clobbered"
I couldn't actually find the %rsp trick in any docs, I assume it came
from discussions with gcc developers directly. Maybe there is
something else we could do that doesn't upset clang?
Perhaps we can mark the frame pointer as an input, for example? Inputs
also have the advantage that appending to the input list doesn't
change the argument numbering, so we don't need to worry about
numbered arguments (not that I mind the naming of arguments, but I
kind of hate having to do it as part of this series).
Hmm?
Linus
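To make the "frame pointer as an input" idea above concrete, here is a minimal sketch. It assumes the GCC/Clang builtin `__builtin_frame_address(0)` is an acceptable way to name the frame pointer; the message does not say how the input would be spelled, and the function name `sketch_asm_with_fp_input` is hypothetical. Whether this forces frame setup ahead of the asm on either compiler is not established by this thread.

```c
/* Hypothetical helper: runs an empty asm statement that takes the current
 * frame address as an extra input operand, then returns x + 1. */
int sketch_asm_with_fp_input(int x)
{
    /* GCC/Clang builtin; using it here is an assumption of this sketch. */
    void *fp = __builtin_frame_address(0);

    /* %0 is the existing in/out operand x. The frame address is appended
     * as the last input (%1), so the numbering of %0 does not change. */
    __asm__ volatile("" : "+r"(x) : "r"(fp));

    return x + 1;
}
```

Because the new operand goes at the end of the input list, existing numbered references in the asm template keep their meaning, which is the property the message points out.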
Do we have a short repro for this problem that doesn't require building the whole kernel?