ampotos / dynStruct

Reverse engineering tool for automatic structure recovering and memory use analysis based on DynamoRIO and Capstone
MIT License

Compile error with Visual studio #23

Closed fimmugit closed 7 years ago

fimmugit commented 7 years ago

I tried to compile Dynstruct from source with VS2015 and I got an error in sym.c lines 9-11:

bool sym_to_hashmap(drsym_info_t *info, drsym_error_t __attribute__((unused))status, void *data)

The error is missing ")" before "{"

I am not sure what to do with that. Could you look at it? Thanks

ampotos commented 7 years ago

This is due to the __attribute__ ((unused)), which is supported by Clang and GCC only (as far as I know). I didn't try to build it on Windows. One way to fix that is to replace every __attribute__ ((unused)) with a macro that is defined as __attribute__ ((unused)) when GCC or Clang is used and expands to nothing for other compilers.
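A minimal sketch of such a macro follows. The name UNUSED and the callback count_symbol are hypothetical and do not come from the dynStruct source; only the __attribute__ ((unused)) spelling is taken from the thread.

```c
#include <stddef.h>

/* UNUSED is a hypothetical macro name, not taken from the dynStruct source. */
#if defined(__GNUC__) || defined(__clang__)
#  define UNUSED __attribute__((unused))
#else
#  define UNUSED /* expands to nothing on compilers without the attribute, e.g. MSVC */
#endif

/* Hypothetical callback whose middle parameter is intentionally ignored.
 * Returns 1 if the symbol was counted, 0 if an argument was NULL. */
int count_symbol(const char *name, int UNUSED status, int *counter)
{
    if (name == NULL || counter == NULL)
        return 0;
    (*counter)++;
    return 1;
}
```

With this in a shared header, each occurrence of the attribute in the sources would become UNUSED, and the compiler check lives in one place.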

If you are trying to use dynStruct on Windows right now you will have some issues (Windows is not yet supported: https://github.com/ampotos/dynStruct/issues/19).

Porting it to Windows should not be a lot of work, apart from rewriting an equivalent of elf.c for the PE format. I don't have the time to do the port myself these days (but I still want to do it one day).

If you want to do the macro and/or the port, PRs are welcome (and I'm available for any questions).

fimmugit commented 7 years ago

Hi, Ampotos: Thanks for getting back to me. I did a quick search and found a way to disable the attribute with #define __attribute__(A), but then I got other errors complaining that void pointers of unknown size were used. For instance, in allocs.c:

line 28: tmp_pc = dr_app_pc_for_decoding(pc - ct);
line 276: block->end = block->start + block->size;

Any idea how to fix this? I have a project which needs to identify an unknown data structure. In your opinion, is dynStruct a suitable tool for this? If so, I could try my best to port it to Windows, although I am not good at this. Best regards,

ampotos commented 7 years ago

From what I saw on Google, VS2015 cannot do pointer arithmetic on a pointer of type void *. Casting the pointers to char * should do it:

tmp_pc = dr_app_pc_for_decoding((char *)pc - ct)
block->end = (char *)block->start + block->size
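For background, arithmetic directly on void * is a GNU extension (GCC and Clang treat the pointee size as 1), which is why the original code builds there and not with MSVC. The sketch below shows the portable form of the second line; block_t is a simplified, hypothetical stand-in for the real allocation record in allocs.c.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-in for the allocation record in allocs.c. */
typedef struct {
    void  *start;
    void  *end;
    size_t size;
} block_t;

/* Compute the one-past-the-end address using char * arithmetic, which is
 * standard C; arithmetic directly on void * is a GNU extension. */
void block_set_end(block_t *block)
{
    block->end = (char *)block->start + block->size;
}
```

The result is converted back to void * implicitly on assignment, so no second cast is needed.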

dynStruct can help you identify an unknown structure by telling you the size of its different members and where they are read/written by the analysed program. Something which may be simpler (depending on what your project is) is to just write a JSON file with the same structure as the one produced by the data gatherer and use dynStruct.py for the recovery of the structure, plus the web interface if needed.

If writing this JSON is not possible for your project, porting dynStruct to Windows can do it. If you go for it, we should keep in touch on IRC or equivalent.

fimmugit commented 7 years ago

Hi, Ampotos: I will look into your fix for the VS build. Meanwhile, I tried the MinGW way. After the initial cmake, I first ran mingw32-make all from the DOS console and ran into an include path problem. Then I ran it again within Mintty and the include issues went away. Now I run into some errors associated with the elf.c file, as you mentioned earlier:

C:\SystemTools\DynamoRIO\dynStruct\src\elf.c:95:1: error: unknown type name 'module_segment_data_t'
 module_segment_data_t find_load_section(const module_data_t mod,
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c: In function 'find_load_section':
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c:132:39: error: 'module_data_t {aka const struct _module_data_t}' has no member named 'num_segments'
   for (uint idx_seg = 0; idx_seg < mod->num_segments; idx_seg++)
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c:134:22: error: 'module_data_t {aka const struct _module_data_t}' has no member named 'segments'
     if ((size_t)mod->segments[idx_seg].end -
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c:135:15: error: 'module_data_t {aka const struct _module_data_t}' has no member named 'segments'
    (size_t)mod->segments[idx_seg].start == tmp_data->size_seg &&
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c:136:8: error: 'module_data_t {aka const struct _module_data_t}' has no member named 'segments'
    (mod->segments[idx_seg].prot == tmp_data->seg_perm))
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c:137:12: error: 'module_data_t {aka const struct _module_data_t}' has no member named 'segments'
  return mod->segments + idx_seg;
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c: In function 'add_plt':
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c:171:3: error: unknown type name 'module_segment_data_t'
   module_segment_data_t seg_plt;
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c:186:31: error: request for member 'start' in something not a structure or union
   new_node->min_addr = seg_plt->start + tmp_data_plt.sect_offset;
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c: In function 'remove_plt':
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c:212:3: error: unknown type name 'module_segment_data_t'
   module_segment_data_t seg;
C:\SystemTools\DynamoRIO\dynStruct\src\elf.c:217:31: error: request for member 'start' in something not a structure or union
   del_from_tree(&plt_tree, seg->start + tmp_data.sect_offset, NULL, true);
CMakeFiles\dynstruct.dir\build.make:212: recipe for target 'CMakeFiles/dynstruct.dir/src/elf.c.obj' failed
mingw32-make[2]: [CMakeFiles/dynstruct.dir/src/elf.c.obj] Error 1
CMakeFiles\Makefile2:66: recipe for target 'CMakeFiles/dynstruct.dir/all' failed
mingw32-make[1]: [CMakeFiles/dynstruct.dir/all] Error 2
Makefile:150: recipe for target 'all' failed
mingw32-make: *** [all] Error 2

Your advice is greatly appreciated!

ampotos commented 7 years ago

The purpose of elf.c is to parse the ELF executable to find the addresses of the GOT and the PLT, so that library calls can be resolved. So this file can only work for ELF executables. If you plan to build dynStruct for Windows executables (it looks like that is your plan), we need to make a pe.c file and use some #ifdef WINDOWS to be multi-OS.

The errors you get are about DynamoRIO internal structures, which are not exactly the same on different OSes. The parts of these structures described in the documentation are almost identical, but if you look at the DynamoRIO source they hold more OS-specific information (see dr_tools.h:1398 for the structure behind one of the errors you had). I had to use some Linux-only fields to find the addresses of the loaded sections of the executable.

The only place where the functions of elf.c are used is dynstruct.c:143. This puts the address of the PLT of every loaded executable and library in a binary tree, which is checked at every wrapped call (look at the function get_caller_data in call.c).
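To make the lookup idea concrete, here is a minimal sketch of a binary search tree of address ranges. It is an illustration under assumptions, not the dynStruct implementation: the type range_node_t and the functions range_insert, range_find and range_free are hypothetical, ranges are half-open and assumed not to overlap, and the tree is not balanced.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node: one half-open range [min_addr, max_addr) per PLT. */
typedef struct range_node {
    uintptr_t min_addr;
    uintptr_t max_addr;
    struct range_node *left;
    struct range_node *right;
} range_node_t;

/* Insert a range keyed by its start address; ranges must not overlap.
 * Returns 0 on success, -1 if allocation fails. */
int range_insert(range_node_t **root, uintptr_t min_addr, uintptr_t max_addr)
{
    while (*root != NULL)
        root = (min_addr < (*root)->min_addr) ? &(*root)->left : &(*root)->right;
    *root = calloc(1, sizeof(**root));
    if (*root == NULL)
        return -1;
    (*root)->min_addr = min_addr;
    (*root)->max_addr = max_addr;
    return 0;
}

/* Return the node whose range contains addr, or NULL if none does. */
const range_node_t *range_find(const range_node_t *node, uintptr_t addr)
{
    while (node != NULL) {
        if (addr < node->min_addr)
            node = node->left;
        else if (addr >= node->max_addr)
            node = node->right;
        else
            return node;
    }
    return NULL;
}

/* Free every node of the tree. */
void range_free(range_node_t *node)
{
    if (node == NULL)
        return;
    range_free(node->left);
    range_free(node->right);
    free(node);
}
```

At a wrapped call, the return address would be passed to something like range_find to decide whether the call came through a PLT stub.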

So a good plan to handle PE files is to write a pe.c for all the PE-specific stuff, and add the needed changes in get_caller_data inside an #ifdef WINDOWS block.
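The split could look roughly like the sketch below: one function name shared by both backends, with the preprocessor choosing the implementation. WINDOWS is the macro named in this thread; the functions ds_exe_format and ds_uses_import_table are hypothetical placeholders, and both variants are shown in one file only to keep the example self-contained.

```c
#include <string.h>

/* WINDOWS is the macro named in the thread; every ds_ name is hypothetical. */
#ifdef WINDOWS
/* Would live in pe.c: the backend that reads the PE import table. */
const char *ds_exe_format(void) { return "pe"; }
#else
/* Would live in elf.c: the backend that reads the ELF PLT and GOT. */
const char *ds_exe_format(void) { return "elf"; }
#endif

/* Shared code calls the common name and never tests the OS itself. */
int ds_uses_import_table(void)
{
    return strcmp(ds_exe_format(), "pe") == 0;
}
```

Keeping the #ifdef around whole functions, instead of scattering it through get_caller_data, would limit the OS-specific code to a few well-defined entry points.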

What do you think?

fimmugit commented 7 years ago

So what you suggest is to build a pe.c to replace elf.c and add some if/else to toggle between the different OSes, right? Can you explain a bit the logic and workflow behind elf.c so that I can better understand what to do for Windows? I am a total outsider to PE and ELF. I have just started learning what they are and hope to see if I can make it through. I hope you can give me a hand along the way. You mentioned JSON. Do you have an example of the JSON output of the data gatherer? I just want to know what to put in it. Thanks for your quick response! Have a good evening

ampotos commented 7 years ago

That's exactly what I'm suggesting. elf.c is needed to get the real address (and symbol) of the called function in the case of a library call. For how the ELF format works (and why I spoke about the GOT and PLT sections), see http://www.cs.stevens.edu/~jschauma/810/elf.html.

The PE format works in a totally different way. It uses only one section, called the import table, which contains the addresses of the target functions: http://resources.infosecinstitute.com/the-import-directory-part-1/ So the executable has to be parsed to find the address of this import table, which is then read when a library is called in order to retrieve the address of the called function.
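As a rough illustration of the first step, the sketch below locates the import directory entry in a PE header held in memory. The header offsets follow the published PE layout (e_lfanew at 0x3c, a 20-byte COFF header, data directories at offset 96 for PE32 and 112 for PE32+, import directory at index 1). The function name pe_import_directory and its helpers are hypothetical, and walking the import descriptors themselves is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian readers, so the code does not depend on host byte order. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: find the import directory (data directory entry 1).
 * Returns 0 and fills rva/size on success, -1 if img is not a usable PE header. */
int pe_import_directory(const uint8_t *img, size_t len, uint32_t *rva, uint32_t *size)
{
    if (len < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return -1;
    size_t pe = rd32(img + 0x3c);              /* e_lfanew: offset of "PE\0\0" */
    if (pe > len - 24)                         /* signature (4) + COFF header (20) */
        return -1;
    if (img[pe] != 'P' || img[pe + 1] != 'E' || img[pe + 2] != 0 || img[pe + 3] != 0)
        return -1;
    size_t opt = pe + 24;                      /* start of the optional header */
    size_t opt_size = rd16(img + pe + 4 + 16); /* SizeOfOptionalHeader */
    if (opt_size < 2 || opt_size > len - opt)
        return -1;
    uint16_t magic = rd16(img + opt);
    size_t dirs;                               /* data directory offset in optional header */
    if (magic == 0x10b)
        dirs = 96;                             /* PE32 */
    else if (magic == 0x20b)
        dirs = 112;                            /* PE32+ */
    else
        return -1;
    if (opt_size < dirs + 16)                  /* need entries 0 and 1, 8 bytes each */
        return -1;
    if (rd32(img + opt + dirs - 4) < 2)        /* NumberOfRvaAndSizes */
        return -1;
    *rva  = rd32(img + opt + dirs + 8);
    *size = rd32(img + opt + dirs + 12);
    return 0;
}
```

The returned RVA is relative to the module's load address, so in a loaded process it would be added to the module base before the import descriptors are read.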

For the JSON, I'll make an example and send it later.

ampotos commented 7 years ago

Maybe we can use this library (https://lief.quarkslab.com/) instead of the parsing I wrote. There will still be some format-specific code, but using a library can help keep the code uniform.

I only had a quick look at this library, so I'm not 100% sure it can do what dynStruct needs, but I think so.

fimmugit commented 7 years ago

Hi, Ampotos: I've got two of your mails today. The PE format seems to be more complex than ELF. I have to draw flow charts to get a better idea of how things work. The more I learn, the more questions I have. I will compile a list of questions for you and check the lib you mentioned in the second mail. Best regards,

ampotos commented 7 years ago

If you want more detail on how elf.c works and why it is needed, I just added a link to my master's thesis to the README. Section 2.2.3 explains that in detail (p. 10-11).

fimmugit commented 7 years ago

After extensive study, I now have a better understanding of PE internals. I will probably come up with a proposal shortly for you to evaluate before any coding. Do you think the dynStruct/DynamoRIO framework has everything I need (like data structures and utilities, other than the Windows-specific ones) to support what I do?

fimmugit commented 7 years ago

What I intend to do is to identify an unknown structure used by a plugin. The plugin interacts with a remote data server and sends a message, along with the data it received (this is the structure I would like to find out), to the Windows application to refresh the GUI. I hope dynStruct can help me achieve this goal.

ampotos commented 7 years ago

For this specific usage dynStruct may not be your best choice (I thought you were speaking of an internal structure in your first post, not one coming from the network). dynStruct is made to allow automatic recovery of the internal structures of a program by looking at how each structure is accessed during execution (and the data must be stored on the heap). If the data comes from a network connection, it depends on how the software handles it. If the software uses it as a structure and stores it on the heap, dynStruct can help; in other cases dynStruct is useless.

fimmugit commented 7 years ago

Sorry to hear that. It's a bit disappointing for me. I am sure this tool does a wonderful job for what it intends to do. Hopefully, it can be ported to Windows in the future. Best,

ampotos commented 7 years ago

I hope to find the time to port it to Windows. Thank you for considering contributing, had it fit your needs.