plasma-disassembler / plasma

Plasma is an interactive disassembler for x86/ARM/MIPS. It can generates indented pseudo-code with colored syntax.
GNU General Public License v3.0
3.05k stars 275 forks source link

Loops wrongly marked as "false loops", deleting nodes from the AST #80

Closed jspam closed 7 years ago

jspam commented 7 years ago

I discovered this during disassembly of an existing binary, but have been unsuccessful in producing C code that compiles to a program triggering the bug. Here is a piece of code that almost triggers it (all code is in https://github.com/jspam/plasma/tree/loop-issue/tests):

nestedloop7.c

#include <stdio.h>
#include <stdlib.h>

int main() {
    int i, j, k, l, m, n, o;

    while (i != 0) {
        if (j == 0) {
            if (k == 0) {
                while (l < 0) {
                    printf("4\n");
                }
            }
            printf("5\n");
        } else {
            printf("6\n");
            do {
              if (m < 0) {
                goto loop2end;
              }
            } while (n <= 0);

            while(o > 0) {
              printf("7\n");
loop2end:
              ;
            }

            printf("8\n");
        }
    }

    return 0;
}

This disassembles fine (nestedloop7-orig.bin, compiled with gcc 7.1.1), but when making the jump at 0x400549 go directly to the loop header (change the byte at offset 0x54A from 0x14 to 0x15, yielding nestedloop7.bin), as shown in the picture

image

the disassembly becomes:

function main (.text) {
    0x400507: push rbp
    0x400508: rbp = rsp
    0x40050b: rsp -= 32
    0x40050f: jmp 0x400570
}

Further investigation shows that the loop starting at 0x400560 is detected, however it is immediately discarded because block 0x400560 is still waiting for predecessor block 0x400551 which is not contained in the loop.

As a result, the loop detection wrongly marks some other blocks as loop headers. The false loop detection then marks many of these false loops, but also the real loop starting at 0x400570, which therefore does not show up in the disassembly.

plasma-disassembler commented 7 years ago

Thank you very much, I will try to investigate when I will have time.

plasma-disassembler commented 7 years ago

I've reproduced the bug with a statement after the label. I've also a bit simplified it by removing the if k == 0) :

#include <stdio.h>
#include <stdlib.h>

int main() {
    int i, j, k, l, m, n, o;

    while (i != 0) {
        if (j == 0) {
            while (l < 0) {
                printf("4\n");
            }
            printf("5\n");
        } else {
            printf("6\n");
            do {
                if (m < 0) {
                    goto loop2end;
                }
            } while (n <= 0);

            while(o > 0) {
                printf("7\n");
loop2end:
                printf("9\n");
            }

            printf("8\n");
        }
    }

    return 0;
}
plasma-disassembler commented 7 years ago

Hi, here are your binaries :

nestedloop7-orig.bin -> tests/nestedloop7.bin nestedloop7.bin -> tests/gotoinloop20_2_nestedloop7.bin

My example in the previous comment is : tests/gotoinloop20.{bin, c}

Thanks

jspam commented 7 years ago

The binary on which I originally discovered the bug is now being disassembled correctly. Many thanks for the quick fix!