atos-tools / qemu

QEMU with instrumentation support, ref to tcg/plugins/README
https://guillon.github.io/qemu-plugins

Is the "on_block_transition/executed" in cpp plugin executing after a block is executed? #12

Open aar0nge opened 5 years ago

aar0nge commented 5 years ago

The comments say it is "called after block was executed", which I do not quite understand. It seems to be called from "before_gen_tb", which runs before a block is executed? Is it because it uses "tcg_gen_callN" to generate code for on_block_executed, so the callback waits until a block is executed?

I also don't understand the difference between on_block_transition and on_block_executed. Can someone help me, please?

I'm trying to implement a hook on functions like malloc using this plugin, so I think I should use "pre_tb_helper_code", right? Thanks for your help!

second-reality commented 5 years ago

Hello aar0nge,

You saw the right comment. The idea of the CPP plugin interface is to offer an abstract API, independent of QEMU internals (I partly ported it to DynamoRIO, for instance). Thus, when using it, you should not have to call any QEMU function.

The QEMU plugin API offers a way to put hooks when a block is translated or executed.

The CPP plugin only offers an interface to put hooks on block execution (translation hooks could be added if needed). Since it gives you information about how the block was reached, easy access to its instructions, and statistics gathered during its execution (mainly memory accesses), we report a block's execution AFTER it was executed (and before the next one is executed). Generated code can't be modified (we offer a "read-only" view of the execution).
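To illustrate the "reported after it was executed" ordering, here is a minimal standalone sketch. It does not use the real plugin code: `Block`, `Reporter` and `on_block_enter` are hypothetical names, and the assumption is simply that a hook fires at the start of every block and uses that moment to report the previous block as executed.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the plugin's view of a block.
struct Block {
    std::uint64_t pc;
};

// Hypothetical reporter: the hook runs at the START of each block.
// At that point the PREVIOUS block is known to have finished, so
// that is the one reported as executed.
class Reporter {
public:
    // Called when execution enters `next`. The caller must keep
    // `next` alive until the following call (assumption of this sketch).
    void on_block_enter(const Block& next) {
        if (previous_ != nullptr) {
            log_.push_back("executed " + std::to_string(previous_->pc));
        }
        previous_ = &next;
    }

    const std::vector<std::string>& log() const { return log_; }

private:
    const Block* previous_ = nullptr;
    std::vector<std::string> log_;
};
```

With this shape, entering the first block reports nothing, and each later entry reports the block that just finished, which is why all information about that block (including how it was left) is available to the callback.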

If you want to use the CPP plugin, you do not have to deal with QEMU interfaces like "pre_tb_helper_code". Just inherit from Interface and put your code there (see this plugin for an example: https://github.com/atos-tools/qemu/blob/next/master/tcg/plugins/cpp/count_instructions.cpp).

If you want "pure" QEMU plugins, see the plugin documentation (for example https://github.com/atos-tools/qemu/blob/next/master/tcg/plugins/dyncount.c).

If you want to implement a hook for malloc with either method, I would start by detecting the first execution of a block inside the malloc function, then put your code there. If you want to modify the behavior of malloc, you'll have to go with QEMU plugins (the CPP plugin does not modify generated code). If you only want to monitor the number of calls to malloc, the CPP plugin should get you there.

With the CPP plugin, I would implement on_block_transition, detect whether the transition is a call, and check whether the next block's function is malloc. It should take less than 10 lines.
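As a rough sketch of that idea, the snippet below counts calls to malloc from a transition callback. Everything here is hypothetical and self-contained: `Interface`, `Block`, `TransitionKind` and the signature of `on_block_transition` are simplified stand-ins, not the real CPP plugin API, so treat it as an outline under those assumptions.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified types; the real CPP plugin API differs.
enum class TransitionKind { Sequential, Jump, Call, Return };

struct Block {
    std::uint64_t pc;
    std::string function;  // symbol name the block belongs to
};

// Hypothetical base class standing in for the plugin "Interface".
class Interface {
public:
    virtual ~Interface() = default;
    virtual void on_block_transition(const Block& prev, const Block& next,
                                     TransitionKind kind) = 0;
};

// Counts transitions that are calls landing in a block of malloc.
class MallocCounter : public Interface {
public:
    void on_block_transition(const Block& /*prev*/, const Block& next,
                             TransitionKind kind) override {
        if (kind == TransitionKind::Call && next.function == "malloc") {
            ++calls_;
        }
    }

    std::uint64_t calls() const { return calls_; }

private:
    std::uint64_t calls_ = 0;
};
```

Filtering on the transition kind is what keeps blocks inside malloc (reached sequentially or by a jump) from being counted as new calls.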

Hope it helps!

second-reality commented 5 years ago

To go further,

on_block_transition and on_block_executed could have been implemented as a single function (they are called one after the other in plugin_api.cpp).

The point is that a lot of plugins only need access to the executed block, without needing to know what happens between blocks (counting instructions, for example).

Thus, on_block_transition was added for plugins that do need to analyze what happens between blocks. It was just a design choice (I wanted the simplest possible API for a simple plugin like counting instructions).

For your needs, you could also implement your malloc hook by keeping track of the function currently being executed (with a pointer) and checking whether it changed between two blocks (using on_block_executed).
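A minimal sketch of this second approach follows. The names are hypothetical (`Block`, `FunctionEntryTracker`, and this `on_block_executed` signature are not the real plugin API), and it compares function names as strings rather than using a pointer, purely to stay self-contained.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified block view; not the real plugin type.
struct Block {
    std::uint64_t pc;
    std::string function;  // symbol name the block belongs to
};

// Remembers the function of the last executed block and counts
// every time execution moves INTO malloc from another function.
class FunctionEntryTracker {
public:
    void on_block_executed(const Block& block) {
        if (block.function != current_function_) {
            current_function_ = block.function;
            if (current_function_ == "malloc") {
                ++malloc_entries_;
            }
        }
    }

    std::uint64_t malloc_entries() const { return malloc_entries_; }

private:
    std::string current_function_;
    std::uint64_t malloc_entries_ = 0;
};
```

One caveat of this variant: if malloc itself calls another function and returns, the return into malloc is counted as a new entry, so it over-counts compared with checking that the transition is a call.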

aar0nge commented 5 years ago

Thanks for your reply! Indeed, I want to modify malloc's behavior, so I may go with the QEMU plugin interface and "pre_tb_helper_code".

I understand that CPP plugin is an abstract API, and I am just trying to understand your implementation of it.
So I saw that "on_block_executed" is implemented mainly by: "before_gen_tb"-->"tcg_gen_callN"-->"on_block_exec"-->"event_block_enter"-->"block_was_executed"-->"on_block_executed", is that correct? Since it starts with "before_gen_tb", shouldn't it be executed before a block is executed?

second-reality commented 5 years ago

OK! In that case it is better for you to go with QEMU plugins.

To explain: as you saw, the code of interest is here (https://github.com/second-reality/qemu/blob/next/master/tcg/plugins/cpp.c).

I'll walk through the code.


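static void before_gen_tb(const TCGPluginInterface* tpi)
{
    /* one slot per translated block; filled in by after_gen_tb() */
    current_block_ptr = malloc(sizeof(translation_block*));

    /* embed the slot address as a constant in the generated code */
    TCGv_ptr t_block = tcg_const_ptr(current_block_ptr);

    /* emit a call to on_block_exec(slot) at the start of the block */
    TCGTemp *args[] = {tcgv_ptr_temp(t_block)};
    tcg_gen_callN(on_block_exec, TCG_CALL_DUMMY_ARG, 1, args);

    tcg_temp_free_ptr(t_block);
}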

This function generates a call (to on_block_exec) at the beginning of each executed block, so the CPP interface is notified each time a block starts. (1) This call takes a parameter that is allocated dynamically once per block and represents that block in the CPP API.

At this point of generation, some information is not yet known (such as the size of the block), hence a second function that runs after generation:


static void after_gen_tb(const TCGPluginInterface* tpi)
{
    /* tb size is only available after tb generation */
    const TranslationBlock* tb = tpi->tb;
    uint64_t pc = tb->pc;
    const uint8_t* code = (const uint8_t*)tpi_guest_ptr(tpi, pc);

    const char* file = NULL;
    uint64_t load_address = 0;

    get_mapped_file(pc, &file, &load_address);

    translation_block* block =
        get_translation_block(pc, code, tb->size, file, load_address);
    /* patch current_block ptr */
    *current_block_ptr = block;
}

It patches the parameter of the call generated earlier (see (1) above), so that the correct block is obtained.

The point is that "get_translation_block" creates a view of the block (disassembles it, records it, ...), so the block needs to be fully emitted before doing all this. Hence the mechanics split between before and after generation. That is the kind of thing that made me develop a CPP API for my needs.
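The before/after mechanics can be modelled outside QEMU with a small sketch. This is only an analogy under stated assumptions: `Translator`, `BlockView` and `GeneratedCode` are hypothetical names, and a `std::function` stands in for the generated code that has the slot address baked in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical view of a block; its size is only known after generation.
struct BlockView {
    std::uint64_t pc;
    std::size_t size;
};

// Stand-in for generated code: it carries the address of a slot,
// not the block view itself, and reads the slot when executed.
using GeneratedCode = std::function<const BlockView*()>;

class Translator {
public:
    // "Before gen": allocate an empty slot and emit code that reads it.
    GeneratedCode before_gen() {
        slots_.push_back(std::make_unique<const BlockView*>(nullptr));
        current_slot_ = slots_.back().get();
        const BlockView** slot = current_slot_;
        return [slot] { return *slot; };
    }

    // "After gen": the size is known, so build the view and patch the slot.
    void after_gen(std::uint64_t pc, std::size_t size) {
        assert(current_slot_ != nullptr);
        views_.push_back(std::make_unique<BlockView>(BlockView{pc, size}));
        *current_slot_ = views_.back().get();
        current_slot_ = nullptr;
    }

private:
    std::vector<std::unique_ptr<const BlockView*>> slots_;
    std::vector<std::unique_ptr<BlockView>> views_;
    const BlockView** current_slot_ = nullptr;
};
```

The extra level of indirection is the whole trick: the emitted code can be finalized before the block view exists, because only the slot's address has to be known at that time.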

I'm pretty sure you know about it, but you are probably a bit confused between translation time and execution time.

What we do in QEMU plugins is patch at translation time. The CPP API offers a view at execution time: a call at the beginning of every block receives a fixed parameter representing the current block, and that parameter is obtained... at translation time.
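To make the two moments concrete, here is a toy model, not QEMU code: `ToyEmulator` and its members are hypothetical. It assumes translated blocks are cached by guest address, so the translation-time work happens once per block while the hook baked into the block runs on every execution (a real emulator may also retranslate, for instance after a cache flush).

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>

// Toy model: translate a block the first time its address is seen,
// then reuse the cached result on every later execution.
class ToyEmulator {
public:
    void run_block(std::uint64_t pc) {
        auto it = cache_.find(pc);
        if (it == cache_.end()) {
            // Translation time: happens once per block. This is where
            // a hook gets inserted into the generated code.
            ++translations_;
            it = cache_.emplace(pc, [this] { ++executions_; }).first;
        }
        // Execution time: the inserted hook runs on every execution.
        it->second();
    }

    std::uint64_t translations() const { return translations_; }
    std::uint64_t executions() const { return executions_; }

private:
    std::unordered_map<std::uint64_t, std::function<void()>> cache_;
    std::uint64_t translations_ = 0;
    std::uint64_t executions_ = 0;
};
```

In this model, before_gen_tb and after_gen_tb correspond to the branch that fills the cache, while on_block_exec corresponds to the cached callable being invoked.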

aar0nge commented 5 years ago

Oh, now I understand. Thanks for your detailed explanation! It really helps.