Open EdSchouten opened 14 years ago
For this to happen, we first need to get the front-ends using the va_arg instruction. For that to happen, CodeGen needs to fully support the va_arg instruction on all important targets. For that to happen, we need target-independent support for va_arg with aggregate types, and target-dependent support for lowering va_arg for all the important targets.
Patches welcome.
This is aka rdar://7832354
The code sequence is the generic one that is needed if you do a vector or fp va_arg. GCC has an optimization pass that scans a function to see whether the va_list provably doesn't escape and whether there are no fp accesses.
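To make the two conditions concrete, here is a small sketch in C. The names `sum_ints`, `format_msg` and `check_va_examples` are made up for illustration and do not come from this report. In the first function the va_list stays local and only `int` values are read. In the second it is handed to `vsnprintf`, so it escapes and the compiler has to assume that fp arguments may be read as well.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* The va_list never leaves this function and only ints are read. */
int sum_ints(int n, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, n);
    for (int i = 0; i < n; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* The va_list escapes into vsnprintf, which may read fp arguments. */
int format_msg(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int written;

    va_start(ap, fmt);
    written = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return written;
}

/* Small self-check; not part of the original report. */
void check_va_examples(void)
{
    char buf[32];

    assert(sum_ints(3, 1, 2, 3) == 6);
    assert(format_msg(buf, sizeof buf, "%d %.1f", 7, 1.5) == 5);
    assert(strcmp(buf, "7 1.5") == 0);
}
```

Whether a given compiler actually drops the xmm spills for the first function depends on the compiler and its version; the sketch only shows which source patterns the two conditions refer to.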
This impacts stuff like the implementation of the open syscall.
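For context, a wrapper for open is typically variadic because the mode argument is only present when O_CREAT is set. The following is a rough sketch of that shape, not the code of any particular libc: `sketch_open`, `sketch_sys_open` and `check_sketch_open` are hypothetical names, and the stand-in "syscall" simply echoes the mode so that the wrapper can be checked.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <sys/types.h>

/* Hypothetical stand-in for the real system call: it just echoes the
 * mode it was given. */
static int sketch_sys_open(const char *path, int flags, mode_t mode)
{
    (void)path;
    (void)flags;
    return (int)mode;
}

int sketch_open(const char *path, int flags, ...)
{
    mode_t mode = 0;

    if (flags & O_CREAT) {
        va_list ap;

        va_start(ap, flags);
        /* Assumption: callers pass the mode as a value that fits in int. */
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    return sketch_sys_open(path, flags, mode);
}

/* Small self-check; not part of the original report. */
void check_sketch_open(void)
{
    assert(sketch_open("f", O_CREAT | O_WRONLY, 0644) == 0644);
    assert(sketch_open("f", O_RDONLY) == 0);
}
```

A function like this reads at most one integer argument and never passes its va_list anywhere, so the full register spill in its prologue is pure overhead.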
Extended Description
The following function compiles to very different code with GCC 4.2.1 and with Clang SVN. The generated code also makes little sense.
int foo(int a, ...) { int r; __builtin_va_list va;
}
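The snippet above appears to have lost its body. Judging from the two listings, which both fetch a single int from the variable arguments and return it in %eax, a plausible reconstruction might look like the following. This is an assumption based on the disassembly, not the reporter's verbatim code, and `check_foo` is added only as a self-check.

```c
#include <assert.h>

/* Reconstruction (assumption): read one int from the variable
 * arguments and return it. */
int foo(int a, ...)
{
    int r;
    __builtin_va_list va;

    __builtin_va_start(va, a);
    r = __builtin_va_arg(va, int);
    __builtin_va_end(va);
    return r;
}

/* Small self-check; not part of the original report. */
void check_foo(void)
{
    assert(foo(0, 42) == 42);
}
```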
GCC:
0000000000000000 <foo>:
0: 48 83 ec 60 sub $0x60,%rsp
4: 48 8d 44 24 68 lea 0x68(%rsp),%rax
9: 48 89 74 24 b0 mov %rsi,0xffffffffffffffb0(%rsp)
e: c7 44 24 88 10 00 00 movl $0x10,0xffffffffffffff88(%rsp)
15: 00
16: 48 89 44 24 90 mov %rax,0xffffffffffffff90(%rsp)
1b: 48 8d 44 24 a8 lea 0xffffffffffffffa8(%rsp),%rax
20: 48 89 44 24 98 mov %rax,0xffffffffffffff98(%rsp)
25: 48 83 c0 08 add $0x8,%rax
29: 8b 00 mov (%rax),%eax
2b: 48 83 c4 60 add $0x60,%rsp
2f: c3 retq
Clang:
0000000000000000 <foo>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 50 sub $0x50,%rsp
8: 84 c0 test %al,%al
a: 74 26 je 32 <foo+0x32>
c: 0f 29 85 60 ff ff ff movaps %xmm0,0xffffffffffffff60(%rbp)
13: 0f 29 8d 70 ff ff ff movaps %xmm1,0xffffffffffffff70(%rbp)
1a: 0f 29 55 80 movaps %xmm2,0xffffffffffffff80(%rbp)
1e: 0f 29 5d 90 movaps %xmm3,0xffffffffffffff90(%rbp)
22: 0f 29 65 a0 movaps %xmm4,0xffffffffffffffa0(%rbp)
26: 0f 29 6d b0 movaps %xmm5,0xffffffffffffffb0(%rbp)
2a: 0f 29 75 c0 movaps %xmm6,0xffffffffffffffc0(%rbp)
2e: 0f 29 7d d0 movaps %xmm7,0xffffffffffffffd0(%rbp)
32: 4c 89 8d 58 ff ff ff mov %r9,0xffffffffffffff58(%rbp)
39: 4c 89 85 50 ff ff ff mov %r8,0xffffffffffffff50(%rbp)
40: 48 89 8d 48 ff ff ff mov %rcx,0xffffffffffffff48(%rbp)
47: 48 89 95 40 ff ff ff mov %rdx,0xffffffffffffff40(%rbp)
4e: 48 89 b5 38 ff ff ff mov %rsi,0xffffffffffffff38(%rbp)
55: 48 8d 85 30 ff ff ff lea 0xffffffffffffff30(%rbp),%rax
5c: 48 89 45 f8 mov %rax,0xfffffffffffffff8(%rbp)
60: 48 8d 45 10 lea 0x10(%rbp),%rax
64: 48 89 45 f0 mov %rax,0xfffffffffffffff0(%rbp)
68: c7 45 ec 30 00 00 00 movl $0x30,0xffffffffffffffec(%rbp)
6f: c7 45 e8 08 00 00 00 movl $0x8,0xffffffffffffffe8(%rbp)
76: 48 63 45 e8 movslq 0xffffffffffffffe8(%rbp),%rax
7a: 48 83 f8 28 cmp $0x28,%rax
7e: 77 0f ja 8f <foo+0x8f>
80: 48 89 c1 mov %rax,%rcx
83: 48 03 4d f8 add 0xfffffffffffffff8(%rbp),%rcx
87: 83 c0 08 add $0x8,%eax
8a: 89 45 e8 mov %eax,0xffffffffffffffe8(%rbp)
8d: eb 0c jmp 9b <foo+0x9b>
8f: 48 8b 4d f0 mov 0xfffffffffffffff0(%rbp),%rcx
93: 48 8d 41 08 lea 0x8(%rcx),%rax
97: 48 89 45 f0 mov %rax,0xfffffffffffffff0(%rbp)
9b: 8b 01 mov (%rcx),%eax
9d: 48 83 c4 50 add $0x50,%rsp
a1: 5d pop %rbp
a2: c3 retq
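For readers who do not want to trace the listing by hand, the tail of the Clang output (offsets 0x76 to 0x9b) corresponds roughly to the C sketch below. The field layout follows the x86-64 System V va_list; the names `sketch_va_list`, `sketch_va_arg_int` and `check_sketch_va_arg` are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Field layout of the x86-64 System V va_list element. */
typedef struct {
    unsigned int gp_offset;   /* next unread GP register slot, in bytes */
    unsigned int fp_offset;   /* next unread XMM register slot, in bytes */
    void *overflow_arg_area;  /* next stack-passed argument */
    void *reg_save_area;      /* where the prologue spilled the registers */
} sketch_va_list;

/* Fetch one int the way the listing does. */
int sketch_va_arg_int(sketch_va_list *ap)
{
    unsigned char *src;
    int value;

    if (ap->gp_offset <= 40) {            /* cmp $0x28,%rax ; ja */
        src = (unsigned char *)ap->reg_save_area + ap->gp_offset;
        ap->gp_offset += 8;               /* add $0x8,%eax */
    } else {
        src = ap->overflow_arg_area;      /* argument lives on the stack */
        ap->overflow_arg_area = src + 8;
    }
    memcpy(&value, src, sizeof value);    /* mov (%rcx),%eax */
    return value;
}

/* Small self-check with a fake register save area and stack slot. */
void check_sketch_va_arg(void)
{
    unsigned char regs[48] = {0}, stack[8] = {0};
    int in_reg = 42, on_stack = 7;

    memcpy(regs + 8, &in_reg, sizeof in_reg);
    memcpy(stack, &on_stack, sizeof on_stack);

    sketch_va_list ap = { 8, 48, stack, regs };
    assert(sketch_va_arg_int(&ap) == 42 && ap.gp_offset == 16);

    ap.gp_offset = 48;                    /* registers exhausted */
    assert(sketch_va_arg_int(&ap) == 7);
}
```

In `foo` the gp_offset is the constant 8 that was stored a few instructions earlier, so the comparison and the stack path can never be taken. That is why the branchy sequence looks pointless there.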