shepmaster / cupid

Get information about the x86 and x86_64 processor
MIT License
34 stars 9 forks source link

Use C shims with inline assembly or compiler intrinsics #6

Closed shepmaster closed 7 years ago

shepmaster commented 7 years ago

Look Ma, no manual function prologues or epilogues!

shepmaster commented 7 years ago

@alexcrichton or @burntsushi, would either of you care to review? If not, that's totally fine; I'll merge in a day or two.

alexcrichton commented 7 years ago

Seems fine to me! I have no idea about inline assembly though so I'm not sure whether it's right or not.

I might recommend leaving a version in the symbol name, that way if you release a new major version of this lib then you can rename that symbol (on the major version) and it can coexist with another major version.

BurntSushi commented 7 years ago

This also looks OK to me, but like @alexcrichton, I don't actually know whether it's right or not. But it seems worth it to try it out since this does seem easier to maintain!

I like the version-in-symbol-name idea too. :-)

shepmaster commented 7 years ago

a version in the symbol name,

Oh, that's what that is! Forwards compatibility makes sense, so I'm going to go ahead and add the second parameter (ecx) to the signature now and add a version ID to the name (although a future release could also introduce one with the same result).

shepmaster commented 7 years ago

I don't actually know whether it's right or not I'm not sure whether it's right or not

Here's the original assembly for 64-bit macOS:

push %rbx
push %rsi
mov ARG2, %rsi
mov ARG1, %eax
xor %rcx, %rcx
cpuid
mov %eax, (%rsi)
mov %ebx, 4(%rsi)
mov %ecx, 8(%rsi)
mov %edx, 12(%rsi)
pop %rsi
pop %rbx
ret

And the disassembly of the C version:

push   %rbp
mov    %rsp,%rbp
push   %rbx
mov    %rdx,%r8
mov    %edi,%eax
mov    %esi,%ecx
cpuid
mov    %eax,(%r8)
mov    %ebx,0x4(%r8)
mov    %ecx,0x8(%r8)
mov    %edx,0xc(%r8)
pop    %rbx
pop    %rbp
retq

It really only has one more instruction to handle the second argument, but the scratch register also changed, so the diff looks bigger than it is:

--- old.asm 2017-03-16 09:43:54.000000000 -0400
+++ new.asm 2017-03-16 09:43:45.000000000 -0400
@@ -1,13 +1,14 @@
+push    %rbp
+mov     %rsp,%rbp
 push    %rbx
-push    %rsi
-mov     ARG2,%rsi
-mov     ARG1,%eax
-xor     %rcx,%rcx
+mov     %rdx,%r8
+mov     %edi,%eax
+mov     %esi,%ecx
 cpuid
-mov     %eax,(%rsi)
-mov     %ebx,4(%rsi)
-mov     %ecx,8(%rsi)
-mov     %edx,12(%rsi)
-pop     %rsi
+mov     %eax,(%r8)
+mov     %ebx,0x4(%r8)
+mov     %ecx,0x8(%r8)
+mov     %edx,0xc(%r8)
 pop     %rbx
-ret
+pop     %rbp
+retq
BurntSushi commented 7 years ago

@shepmaster I should say that I didn't know whether the Assembly version was right or not either. :-)

I basically have zero experience with any of this. I was prepared to learn how to do it, but @alexcrichton is too awesome and did it instead!

alexcrichton commented 7 years ago

Yeah looks right to me! I was basically lazy looking up calling conventions and decided to push %rsi instead of using a scratch register, but otherwise that assembly does frame pointer management and looks correct.