windelbouwman / ppci

A compiler for ARM, X86, MSP430, xtensa and more implemented in pure Python
https://ppci.readthedocs.io/en/latest/
BSD 2-Clause "Simplified" License

Translating to ppci assembly #122

Closed darleybarreto closed 3 years ago

darleybarreto commented 3 years ago

Hi, I am not a system programmer, so my skills on this subject are really limited, especially when it comes to assembly. I am trying to make the x86_64 musl longjmp work on ppci. This is the implementation:

.global longjmp
longjmp:
    xor %eax,%eax
    cmp $1,%esi             /* CF = val ? 0 : 1 */
    adc %esi,%eax           /* eax = val + !val */
    mov (%rdi),%rbx         /* rdi is the jmp_buf, restore regs from it */
    mov 8(%rdi),%rbp
    mov 16(%rdi),%r12
    mov 24(%rdi),%r13
    mov 32(%rdi),%r14
    mov 40(%rdi),%r15
    mov 48(%rdi),%rsp
    jmp *56(%rdi)           /* goto saved address without altering rsp */
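
For readers unfamiliar with the carry trick in the first three instructions, here is a minimal C sketch of what they compute. The function name `longjmp_retval` is hypothetical and exists only for illustration; it is not part of musl or ppci.

```c
/* Hypothetical helper: models xor/cmp/adc from the listing above.
 * cmp $1,%esi computes val - 1 and sets the carry flag only when the
 * unsigned value of val is below 1, i.e. when val == 0. */
int longjmp_retval(int val)
{
    int carry = ((unsigned)val < 1u);  /* CF = val ? 0 : 1 */
    return val + carry;                /* adc %esi,%eax with eax == 0 */
}
```

So a `val` of 0 becomes 1, and any other value passes through unchanged, which matches the `eax = val + !val` comment.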

Is there a guide to the asm syntax that ppci reads? I only know a little, and only from simple examples. I think the snippet above should become something like the following:

global longjmp
longjmp:
    xor eax,eax
    cmp #1,esi             ; CF = val ? 0 : 1
    adc esi,eax           ; eax = val + !val
    mov (rdi),rbx         ; rdi is the jmp_buf, restore regs from it
    mov 8(rdi),rbp
    mov 16(rdi),r12
    mov 24(rdi),r13
    mov 32(rdi),r14
    mov 40(rdi),r15
    mov 48(rdi),rsp
    jmp *56(rdi)           ; goto saved address without altering rsp

pfalcon commented 3 years ago

There are two main assembly syntaxes for x86: "Intel" and "AT&T". PPCI clearly follows the "Intel" syntax; you can search for the differences between the two.

pfalcon commented 3 years ago

Compared with classic "Intel" syntax, PPCI clearly lacks the (in)famous byte ptr and friends (perhaps PPCI never generates instructions with a memory operand and an immediate argument?) and uses [rbp, 8] instead of [rbp + 8] for indirect memory access.

Don't forget the -S switch to ppci-cc.
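
As a rough illustration of the memory-operand difference described above, the following C sketch rewrites an AT&T operand such as `8(%rdi)` into the bracketed, comma-separated form `[rdi, 8]`. The function name `att_mem_to_bracket` is hypothetical, and it handles only the simple `disp(%reg)` shape with an optional non-negative decimal displacement; index registers, scales, and negative displacements are out of scope.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: "8(%rdi)" -> "[rdi, 8]", "(%rdi)" -> "[rdi]".
 * Returns 0 on success, -1 if the input does not match the supported
 * shape or the output buffer is too small. */
int att_mem_to_bracket(const char *att, char *out, size_t outlen)
{
    size_t ndisp = 0, nreg = 0;
    const char *reg;
    int n;

    /* Optional decimal displacement in front of the parenthesis. */
    while (isdigit((unsigned char)att[ndisp]))
        ndisp++;
    if (att[ndisp] != '(' || att[ndisp + 1] != '%')
        return -1;

    /* Register name between "(%" and ")". */
    reg = att + ndisp + 2;
    while (isalnum((unsigned char)reg[nreg]))
        nreg++;
    if (nreg == 0 || reg[nreg] != ')' || reg[nreg + 1] != '\0')
        return -1;

    if (ndisp == 0)
        n = snprintf(out, outlen, "[%.*s]", (int)nreg, reg);
    else
        n = snprintf(out, outlen, "[%.*s, %.*s]",
                     (int)nreg, reg, (int)ndisp, att);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Operand order also flips between the two syntaxes: AT&T writes the source first, Intel style writes the destination first, so `mov 8(%rdi),%rbp` becomes `mov rbp, [rdi, 8]`.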

windelbouwman commented 3 years ago

I am not a system programmer

Not yet, you will be soon! :stuck_out_tongue_closed_eyes:

The assembly syntax is sort of Intel style, as mentioned. It would be a cool addition to support more than one syntax, but that would probably require a big rework of the current system.

darleybarreto commented 3 years ago

So should it be something like this?

global longjmp
longjmp:
    xor eax, eax
    cmp esi, 1             ; CF = val ? 0 : 1 
    adc eax, esi           ; eax = val + !val   
    mov rbx, [rdi]         ; rdi is the jmp_buf, restore regs from it 
    mov rbp, [rdi,8]
    mov r12, [rdi,16]
    mov r13, [rdi,24]
    mov r14, [rdi,32]
    mov r15, [rdi,40]
    mov rsp, [rdi,48]
    jmp [rdi,56]           ; goto saved address without altering rsp 

darleybarreto commented 3 years ago

So, I defined the jmp_buf as:

typedef unsigned long jmp_buf[8];

Following this implementation, setjmp is

global setjmp
setjmp:
    mov [rdi], rbx          ; rdi is jmp_buf, move registers onto it 
    mov [rdi,8], rbp
    mov [rdi,16], r12
    mov [rdi,24], r13
    mov [rdi,32], r14
    mov [rdi,40], r15
    lea rdx, [rsp,8]        ; this is our rsp WITHOUT current ret addr 
    mov [rdi,48], rdx
    mov rdx, [rsp]          ; save return addr ptr for new rip 
    mov [rdi,56], rdx
    xor rax,rax             ; always return 0 
    ret

and longjmp:

global longjmp
longjmp:
    mov rax, rsi            ; val will be longjmp return 
    test rax,rax
    jne __longjmp
    inc rax                 ; if val==0, val=1 per longjmp semantics 
__longjmp:
    mov rbx, [rdi]          ; rdi is the jmp_buf, restore regs from it 
    mov rbp, [rdi,8]
    mov r12, [rdi,16]
    mov r13, [rdi,24]
    mov r14, [rdi,32]
    mov r15, [rdi,40]
    mov rsp, [rdi,48]       ; this ends up being the stack pointer 
    mov rsp, rdx
    mov rdx, [rdi,56]       ; this is the instruction pointer 
    jmp [rdx]               ; goto saved address without altering rsp

setjmp seems to work fine, but longjmp is seg faulting in this code:

#include <stdlib.h>
#include <stdio.h>
#include <setjmp.h>

static jmp_buf jmp_buffer;

enum { OK, ERR_DIV_BY_ZERO };

int div(int a, int b) {
    if(b==0) {
        longjmp(jmp_buffer, ERR_DIV_BY_ZERO);
    }

    return a/b;
}

int main(void) {
    int result, error;
    error = setjmp(jmp_buffer);

    if(OK==error) { /* try */
        result = div(1, 0);
    }
    else { /* catch */
        printf("Error in div().");
        exit(1);
    }

    printf("Result: %d\n", result);
    exit(0);
}

The difference from the original implementation is that I use jne __longjmp instead of jnz __longjmp, which should be equivalent.
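
One plausible reading of the segfault, not confirmed anywhere in this thread, is the pair `mov rsp, [rdi,48]` followed by `mov rsp, rdx`: with the destination written first, the second instruction overwrites the freshly restored stack pointer with whatever rdx happened to hold. The toy C model below tracks only rsp and rdx to show the difference. The names `regs_t`, `restore_sp_as_posted`, and `restore_sp_via_rdx` are hypothetical, and the "via rdx" ordering is an assumption about what was intended.

```c
#include <stdint.h>

/* Toy model: only the two registers relevant to the stack restore. */
typedef struct { uint64_t rsp, rdx; } regs_t;

/* As posted: mov rsp, [rdi,48] ; mov rsp, rdx */
regs_t restore_sp_as_posted(const uint64_t buf[8], regs_t r)
{
    r.rsp = buf[6];   /* rsp = saved stack pointer (offset 48) */
    r.rsp = r.rdx;    /* ...then replaced by the stale value in rdx */
    return r;
}

/* Assumed intent: mov rdx, [rdi,48] ; mov rsp, rdx */
regs_t restore_sp_via_rdx(const uint64_t buf[8], regs_t r)
{
    r.rdx = buf[6];   /* load the saved stack pointer into rdx */
    r.rsp = r.rdx;    /* copy it into rsp */
    return r;
}
```

Separately, in Intel-style syntax `jmp [rdx]` normally loads the jump target from the memory rdx points to. Since rdx already holds the saved return address at that point, this would dereference it a second time instead of jumping to it.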

darleybarreto commented 3 years ago

Replacing the previous longjmp asm with the following, taken from newlib, does the trick:

global longjmp
longjmp:
    mov rax, rsi            ; val will be longjmp return 
    mov rbp, [rdi,8]
    mov rsp, [rdi,48]       ; this ends up being the stack pointer 
    mov rbx, [rdi,56]
    push rbx
    mov rbx, [rdi,0]          ; rdi is the jmp_buf, restore regs from it 
    mov r12, [rdi,16]
    mov r13, [rdi,24]
    mov r14, [rdi,32]
    mov r15, [rdi,40]
    ret
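
Note that the listing above copies rsi into rax without adjusting it, so calling longjmp with a value of 0 would apparently make setjmp return 0, whereas the C standard requires it to return 1 in that case. The test program earlier in this thread passes ERR_DIV_BY_ZERO (which is 1), so it is unaffected. A minimal C sketch of a check for this corner case follows; the function name `longjmp_zero_yields_one` is hypothetical.

```c
#include <setjmp.h>

/* Hypothetical check: returns 1 if longjmp(env, 0) makes setjmp
 * return 1, as the C standard requires, and 0 otherwise. */
int longjmp_zero_yields_one(void)
{
    jmp_buf env;
    volatile int jumped = 0;   /* volatile so it survives the jump */

    if (setjmp(env) == 1)
        return jumped;         /* reached only through longjmp */
    if (jumped)
        return 0;              /* came back with a value other than 1 */
    jumped = 1;
    longjmp(env, 0);           /* does not return */
}
```

Calling this from a small test program linked against the setjmp and longjmp under test should yield 1 for a conforming implementation.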

windelbouwman commented 3 years ago

Well played! This is a tricky thing because it involves the stack layout, so it depends on variables kept in saved registers. Glad you got it working!