polyglot-compiler / JLang

JLang: Ahead-of-time compilation of Java programs to LLVM
http://polyglot-compiler.github.io/JLang/

Handle Java null pointers (and divide by zero, etc.) #4

Open dz333 opened 6 years ago

dz333 commented 6 years ago

From @gharrma on February 23, 2017 4:23

Options:

Copied from original issue: gharrma/polyllvm#20

dz333 commented 6 years ago

From @andrewcmyers on March 9, 2017 21:38

The interrupt approach would require modifying the ucontext structure in the signal handler so that the pc points to an exception-generating code segment. This sounds doable, though I don't know how nicely it plays with LLVM.

dz333 commented 6 years ago

From @andrewcmyers on March 10, 2017 0:19

Here is some C code that recovers from SIGSEGV (sort of). From LLVM code you can do better because you know the addresses of the generated code. In this example we 'recover' into a function that probably has a different stack layout from main, leading to a second segmentation violation.

#include <stdio.h>
#include <signal.h>

int sawit = 0;
ucontext_t saved_ucontext;

int recover(void);
void action(int sig, siginfo_t *info, void *ucontext);

int main(int argc, char **argv) {
    int *x = (int *)0;

    struct sigaction sa;
    sa.sa_sigaction = action;
    sigemptyset(&sa.sa_mask); // portable way to clear the mask
    sa.sa_flags = SA_SIGINFO;

    sigaction(SIGSEGV, &sa, 0);

    printf("Assigning...\n");

    int y = *x; // faults; the handler then redirects the pc
}

void action(int sig, siginfo_t *info, void *ucontext) {
    sawit = 1;
    ucontext_t *u = (ucontext_t *)ucontext;
    saved_ucontext = *u;
    // XXX machine-dependent: these field names are the macOS x86-64 layout
    u->uc_mcontext->__ss.__rip = (unsigned long long)&recover;
}

int recover(void) {
    printf("Recovered from SIGSEGV: sawit = %d\n", sawit);
    return 0;
}

dz333 commented 6 years ago

From @andrewcmyers on March 10, 2017 0:24

Note that the actual signal handler generated would need to map from the current pc (__rip) to the desired pc for handling the exception. Maybe there is a way to reuse the existing exception machinery for this?
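A minimal sketch of that pc-to-pc mapping, assuming the compiler could emit a table of code ranges sorted by start address. The `pc_range` type and `find_landing_pad` function are hypothetical names for illustration; nothing here is taken from JLang or LLVM. The handler would look up the faulting pc and, if a range covers it, write the matching landing pad back into the saved context.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a half-open range of instruction addresses
 * [start, end) and the pc to resume at when a fault occurs inside it. */
typedef struct {
    uintptr_t start;
    uintptr_t end;
    uintptr_t landing_pad;
} pc_range;

/* Binary search over a table sorted by start, with non-overlapping ranges.
 * Returns the landing pad for fault_pc, or 0 if no range covers it. */
uintptr_t find_landing_pad(const pc_range *table, size_t n, uintptr_t fault_pc) {
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (fault_pc < table[mid].start) {
            hi = mid;
        } else if (fault_pc >= table[mid].end) {
            lo = mid + 1;
        } else {
            return table[mid].landing_pad;
        }
    }
    return 0;
}
```

A result of 0 would mean the fault did not come from a known null-checkable access, in which case the handler should probably fall back to the default behavior rather than resume.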

dz333 commented 6 years ago

From @gharrma on March 10, 2017 1:28

Interesting code snippet! Something like that might work, though my understanding isn't deep enough yet to know how to get from the signal handler to a NullPointerException that can be caught successfully. (Is it possible to just recover into a function (compiled from Java) that throws the NullPointerException explicitly?)

In case it's useful, here's an example I made where recovering to main succeeds.

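#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <setjmp.h>

sigjmp_buf recover;

void handler(int sig, siginfo_t *info, void *ucontext) {
    printf("Handling SIGSEGV\n"); // not async-signal-safe; fine for a demo
    // siglongjmp also restores the signal mask, so SIGSEGV is unblocked again
    siglongjmp(recover, 1);
}

int main(int argc, char **argv) {
    struct sigaction sa;
    sa.sa_sigaction = handler;
    sigemptyset(&sa.sa_mask); // portable way to clear the mask
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, 0);

    // Second argument 1: save the signal mask along with the registers
    if (sigsetjmp(recover, 1) != 0) {
        printf("Recovered to main!\n");
        return 0;
    }

    printf("Assigning...\n");
    volatile int *x = (int*) 0; // volatile keeps the load from being optimized away
    int y = *x;
}

dz333 commented 6 years ago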

From @andrewcmyers on March 10, 2017 1:52

setjmp/longjmp will be too expensive to use. I think you just want to reset the pc as you return to another label within the same function, then throw the null pointer exception as you normally would. I guess there may be some issues with restoring variables into registers. But the exception mechanism should have exactly the same issues.

dz333 commented 6 years ago

From @andrewcmyers on April 25, 2017 15:45

Did we figure out how to do this?

dz333 commented 6 years ago

From @gharrma on April 25, 2017 22:18

Unfortunately not; I set aside NullPointerException for a bit while working on other parts. Should we implement the simple approach first (a check on each pointer access)?

dz333 commented 6 years ago

From @andrewcmyers on April 25, 2017 22:22

That makes sense as a starting point. We would probably want to make that an option in any case, since the alternatives are either not machine-independent or break Java semantics unacceptably.

dz333 commented 6 years ago

From @gharrma on February 12, 2018 2:23

Here's an update on what I know after reading about this more in my free time:

dz333 commented 6 years ago

From @gharrma on May 24, 2018 17:40

Update: we still do not check for NullPointerExceptions, although we do print a nice message after segfaults and divide-by-zero.
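For illustration, here is a minimal sketch of how a runtime could print such a message, assuming POSIX signals. It is not JLang's actual runtime code; the function names and message wording are hypothetical. The handler only reports and exits, so it does not turn the fault into a catchable Java exception.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Pick a message for a fatal signal. The wording is illustrative only. */
const char *message_for_signal(int sig) {
    switch (sig) {
    case SIGSEGV: return "fatal: segmentation fault (possible null dereference)\n";
    case SIGFPE:  return "fatal: arithmetic error (possible divide by zero)\n";
    default:      return "fatal: unexpected signal\n";
    }
}

/* Handler: report with write() rather than printf(), then exit. */
void fatal_handler(int sig) {
    const char *msg = message_for_signal(sig);
    (void)write(STDERR_FILENO, msg, strlen(msg));
    _exit(1);
}

/* Install the handler for both signals; returns 0 on success, -1 on failure. */
int install_fatal_handlers(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fatal_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;
    if (sigaction(SIGFPE, &sa, NULL) != 0) return -1;
    return 0;
}
```

Calling `install_fatal_handlers()` once at startup would be enough in this sketch.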

If we are OK with the performance hit (which is likely to be small), it would be relatively easy to add null pointer checks in emitted code. To insert checks before field accesses, see ObjectStruct_c#buildFieldElementPtr. For method calls, see PolyLLVMCallExt#buildFuncPtr.
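As a rough picture of what an explicit check amounts to, here is a C sketch of a checked field load. The `object` layout, `raise_npe`, and `load_field_checked` are hypothetical stand-ins; in the compiler the check would be emitted as LLVM IR and the failing branch would throw NullPointerException rather than set a flag.

```c
#include <stddef.h>

/* Hypothetical object layout, for illustration only. */
typedef struct {
    int field;
} object;

/* Hypothetical flag standing in for throwing NullPointerException. */
int npe_raised = 0;

void raise_npe(void) { npe_raised = 1; }

/* A checked field load: compare against null first, then access.
 * Returns fallback on the null path, since C has no exceptions. */
int load_field_checked(const object *obj, int fallback) {
    if (obj == NULL) {
        raise_npe();
        return fallback;
    }
    return obj->field;
}
```

The cost on the non-null path is one compare and one branch per access, and the branch should be well predicted.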