JonathanSalwan / Triton

Triton is a dynamic binary analysis library. Build your own program analysis tools, automate your reverse engineering, perform software verification or just emulate code.
https://triton-library.github.io
Apache License 2.0
3.52k stars 538 forks source link

How to cope with TypeError: std::bad_alloc ? #359

Closed gitttt closed 8 years ago

gitttt commented 8 years ago

This is the line causing it:

models = getModel(assert_(land(previousConstraints, branch['constraint'])))

Is this something one can avoid? Is it a bug in the Triton CPP code or in Z3? The exception is thrown while waiting for the above line to complete. It takes approximately 20 minutes or so before the exceptions occurs.

I have 12gig of RAM.

JonathanSalwan commented 8 years ago

Is this something one can avoid? Is it a bug in the Triton CPP code or in Z3?

Probably in Z3. Maybe your expression si too complex. Try with the optimization ALIGNED_MEMORY enabled.

enableSymbolicOptimization(OPTIMIZATION.ALIGNED_MEMORY, True)
gitttt commented 8 years ago

I already have that optimization set. The target program simply does a strcmp(), nothing more. The trace contains ~10.000 instructions. Did you ever run into this problem? Does it indicate that I am using the Triton API in the wrong way? How can I avoid this?

JonathanSalwan commented 8 years ago

Does it indicate that I am using the Triton API in the wrong way? How can I avoid this?

I can't tell like that. Show me the code.

gitttt commented 8 years ago

Here is the relevant (shortened) code, it is a minimal adaptation of your example:

for entry in dict_trace['trace_entries']:
    inst = Instruction()
    inst.setOpcodes(entry['opcode'])
    inst.setAddress(entry['instruction_address'])

    if 'register_writes' in entry:
        for reg_write in entry['register_writes']:
            regs[reg_write['reg_name']] = reg_write['reg_value']

    inst.updateContext(Register(REG.RAX,    regs['rax']))
    inst.updateContext(Register(REG.RBX,    regs['rbx']))
    #...
    inst.updateContext(Register(REG.EFLAGS, regs['eflags']))

    if 'memory_accesses' in entry:
        for mem_acc in entry['memory_accesses']:
            if mem_acc['memory_access_type'] == 'R':
                address = mem_acc['memory_address']
                size    = mem_acc['memory_access_size']
                value   = mem_acc['memory_access_value']
                inst.updateContext(Memory(address, size, value))
    # same for write accesses

    processing(inst)
def get_new_input_data():
    list_of_new_input_data = list()

    # Get path constraints from the last execution
    pco = getPathConstraints()
    print "The concolically (re)executed code contained a total of %d jumps and branches" % len(pco)

    # We start with any input. T (Top)
    previousConstraints = equal(bvtrue(), bvtrue())

    # Go through the path constraints
    for pc in pco:
        #print "getTakenAddress(): 0x%x" % pc.getTakenAddress()
        # If there is a condition
        if pc.isMultipleBranches():
            # Get all (two ?) branches
            branches = pc.getBranchConstraints()
            for branch in branches:
                # Get the constraint of the branch which has been not taken
                if branch['taken'] == False:
                    print "Analyzing branch at address 0x%x" % branch['target']
                    # TODO: this is not the location of the branch, right? It is the target of the branch.
                    # Ask for a model
                    print "Sending query to solver and waiting for response ..."
                    start = timeit.default_timer()
                    try:
                        models = getModel(assert_(land(previousConstraints, branch['constraint'])))
                    except Exception as e:
                        print e
                        end = timeit.default_timer()
                        print "Solver error after %f seconds." % (end - start)
                        sys.exit(1)

I initially did not add any code to my question because I thought it could be obvious that something is wrong with my code (and not Z3 running out of memory.).

I thought maybe a trace with "only" 10.000 instructions should not cause any such problems except if you are using the Triton API in a wrong way.

JonathanSalwan commented 8 years ago

I thought maybe a trace with "only" 10.000 instructions should not cause any such problems except if you are using the Triton API in a wrong way.

  • How long are the traces you normally process without any memory exhaustion problems?

The number of instruction doesn't matter. You can timeout with 50 instructions and get a model with 100K instructions... It really depend of the complexity of your expression. Based on a quick view, your code looks good. Can you print the expression before getting the model please?

  • Given the most recent changes, I cannot set enableSymbolicEmulation(False) any more. Can this be the cause? I guess not because I had these bad_alloc problem also before this change (although maybe less quickly (?) ).

Your error should not be related to the new updates, the exception is raised by Z3.

gitttt commented 8 years ago

I guess print getFullAst(branch['constraint']) is the right command, right?

Using getFullAst() The expression is huge! Too huge to print it completely.

Without getFullAst, branch['constraint'] is: (not (= (ite (= ((_ extract 0 0) ref!43912) (_ bv1 1)) (_ bv4561280 64) (_ bv4561422 64)) (_ bv4561422 64)))

There is something strange going on here, the program is really not complex (very similar to your example):


#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define MATCH 1
#define NO_MATCH 0

char *plaintext_debug_password = "123456789";

// Insecure: all strings starting with 123456789 are valid passwords
int check_byte_by_byte(char *entered_password)
{
    size_t i = 0;
    while (i < 9)
    {
        if (entered_password[i] != plaintext_debug_password[i])
        {
            return NO_MATCH;    
        }
        i++;
    }
    return MATCH;
}

int main(int argc, char** argv)
{   

    char *plaintext_password = argv[1];

    if (check_byte_by_byte(plaintext_password) == MATCH)
    {
        printf("Access granted.\n");
        return 0;
    }
    else
    {
        printf("Wrong password.\n");
        return 0;
    }
}

Compiled with: gcc -static triton_toy_example_byte_compare.c -o triton_toy_example_byte_compare_statically_linked

Started with: ./triton_toy_example_byte_compare_statically_linked asdf

Trace created with PIN, tracing everything (i.e. not using any filters or no_shared_libs).

If, however, I compile a dynamically linked version and use the no_shared_libs, then it runs very fast and all input data is successfully generated (after ~45 secs).

JonathanSalwan commented 8 years ago

I guess print getFullAst(branch['constraint']) is the right command, right?

Yes

Using getFullAst() The expression is huge! Too huge to print it completely.

That's why you timeout.

    if 'memory_accesses' in entry:
        for mem_acc in entry['memory_accesses']:
            if mem_acc['memory_access_type'] == 'R':
                address = mem_acc['memory_address']
                size    = mem_acc['memory_access_size']
                value   = mem_acc['memory_access_value']
                inst.updateContext(Memory(address, size, value))
    # same for write accesses

Can you try without update the write access context? I will close this issue because it's not related to the Triton project. However, feel free to continue to write in this thread.