fiberx / fiber

Source-binary patch presence test system.
BSD 2-Clause "Simplified" License

There is a problem about tool/ext_sym in fiber. #6

Closed qiuguoping closed 5 years ago

qiuguoping commented 5 years ago

Hi cpumask, we found a problem with ext_sym: the symbol table generated by tool/ext_sym is not consistent with the system map generated by the compiler for the same kernel image, and the same function has different addresses in the two. Can you tell me whether this can cause patch matching to fail? In some cases we have to generate the symbol table with ext_sym because no system map is available. We are looking forward to your reply, thank you.

cpumask commented 5 years ago

Hi, guoping,

The correctness of the symbol table is important for fiber, since we need to know the function addresses for both signature generation (from the reference kernel) and matching (against the target kernel). If a function address is wrong, it's unlikely that fiber can still give correct results.

If we have a kernel image and extract its embedded symbol table using "tool/ext_sym", I believe that symbol table should be accurate since it's directly extracted from the image itself. The compiler generated "system.map" in most cases should be consistent with the embedded symbol table, but if not, I suggest that we use the symbol table extracted by "tool/ext_sym".

BTW. If it's possible, you can also let us know the kernel source code and compiler options you use to generate the "system.map" and the final kernel image, so that we may be able to do an investigation regarding the inconsistency. Thanks!

qiuguoping commented 5 years ago

The kernel source code we tested belongs to a product, so it is difficult to share it outside the company. We can test an open-source kernel and check whether there is a similar problem; if so, we will tell you. Thank you.

cpumask commented 5 years ago

Ok, sounds great, thanks!

qiuguoping commented 5 years ago

We compared system.map with the symbol map extracted by "tool/ext_sym"; the addresses are shifted by one entry relative to the names. For example:

system.map:
    ffffff8008080000 t _head
    ffffff8008080000 T _text
    ffffff8008081000 T __exception_text_start
    ffffff8008081000 T _stext
    ffffff8008081004 T do_undefinstr
    ffffff800808139c T do_sysinstr
    ffffff800808143c T do_mem_abort
    ffffff8008081518 T do_el0_irq_bp_hardening

symbol map extracted by "tool/ext_sym":
    0xffffff8008080000 t _head
    0xffffff8008081000 T _text
    0xffffff8008081000 T _stext
    0xffffff8008081004 T __exception_text_start
    0xffffff800808139c T do_undefinstr
    0xffffff800808143c T do_sysinstr
    0xffffff8008081518 T do_mem_abort
    0xffffff8008081530 T do_el0_irq_bp_hardening

How can we solve it? We don't know which one is correct.
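
One way to quantify the disagreement between the two listings is to compare them by symbol name rather than by position. The following is only a minimal sketch: the struct and function names (sym, find_sym, count_mismatches) are hypothetical and not part of fiber, and only the eight addresses and names quoted in this thread are used as data. It does not decide which table is the correct one.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One symbol-table entry: address, type letter, name. */
struct sym {
    uint64_t addr;
    char type;
    const char *name;
};

/* Look up a symbol by name; returns NULL when the name is absent. */
const struct sym *find_sym(const struct sym *tab, size_t n, const char *name)
{
    assert(tab != NULL && name != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Print and count the names present in both tables whose addresses differ. */
size_t count_mismatches(const struct sym *a, size_t na,
                        const struct sym *b, size_t nb)
{
    size_t bad = 0;
    for (size_t i = 0; i < na; i++) {
        const struct sym *other = find_sym(b, nb, a[i].name);
        if (other != NULL && other->addr != a[i].addr) {
            printf("%s: %016" PRIx64 " vs %016" PRIx64 "\n",
                   a[i].name, a[i].addr, other->addr);
            bad++;
        }
    }
    return bad;
}

/* Entries quoted in the thread from system.map. */
const struct sym system_map[] = {
    {0xffffff8008080000ULL, 't', "_head"},
    {0xffffff8008080000ULL, 'T', "_text"},
    {0xffffff8008081000ULL, 'T', "__exception_text_start"},
    {0xffffff8008081000ULL, 'T', "_stext"},
    {0xffffff8008081004ULL, 'T', "do_undefinstr"},
    {0xffffff800808139cULL, 'T', "do_sysinstr"},
    {0xffffff800808143cULL, 'T', "do_mem_abort"},
    {0xffffff8008081518ULL, 'T', "do_el0_irq_bp_hardening"},
};

/* Entries quoted in the thread from the tool/ext_sym output. */
const struct sym ext_sym_out[] = {
    {0xffffff8008080000ULL, 't', "_head"},
    {0xffffff8008081000ULL, 'T', "_text"},
    {0xffffff8008081000ULL, 'T', "_stext"},
    {0xffffff8008081004ULL, 'T', "__exception_text_start"},
    {0xffffff800808139cULL, 'T', "do_undefinstr"},
    {0xffffff800808143cULL, 'T', "do_sysinstr"},
    {0xffffff8008081518ULL, 'T', "do_mem_abort"},
    {0xffffff8008081530ULL, 'T', "do_el0_irq_bp_hardening"},
};
```

Calling count_mismatches(system_map, 8, ext_sym_out, 8) on these entries reports six names with differing addresses; only _head and _stext agree, which is consistent with names and addresses being paired one slot apart.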

cpumask commented 5 years ago

Hi, guoping,

Thanks for the feedback. At this point I'm still not sure why there is a mismatch between system.map and the kernel image's built-in symbol table (extracted by "ext_sym"); we may need a further investigation later. In this situation, I would suggest we just go with the symbol table extracted by "tool/ext_sym". It should be the correct one since it's directly embedded in the kernel image we want to analyze. Please let us know if you encounter any issues with this symbol table. Thanks!

qiuguoping commented 5 years ago

There are some logs on screen when extracting symbol table by tool/ext_sym:

    Image size: 40260920
    Image base address in memory: 0x7f01653e2000
    Locating the token_table address...
    token_table: 0x7f0166a50100
    Locating other sections...
    It seems that kallsyms_addresses section is not good, try to do relocation...
    RELO_SEC 0: 0x7f0166d2f508 for off: 0xffffffff26001402- 0x7f0166d305e8 for off: 0xffffffff10a78086
    RELO_SEC 1: 0x7f0167010098 for off: 0xffffff8008084000- 0x7f0167510690 for off: 0xffffff80099d2cd8
    RELO_SEC 2: 0x7f0167510858 for off: 0xffffff8009c48dd0- 0x7f01675c8698 for off: 0xffffff8009a03968
    m0 p:0x7f0164e28010 e:0x7f01653e1570
    Below relo info is for kallsyms_addresses...
    relo_st:0x7f0164eee940 for off:0xffffff8009418308, relo_ed:0x7f01653e1570 for off:(nil)
    syms_names: 0x7f0166888200
    syms_addrs: 0x13d4010
    num_syms: 138176

qiuguoping commented 5 years ago

I hope that the above logs can be useful to locate the problem, Thanks.

cpumask commented 5 years ago

Hi, guoping,

Thanks for sharing. From the "ext_sym" log, it seems the symbol table extraction process was successful. We can try this symbol table and see whether it works.

qiuguoping commented 5 years ago

In the above logs, there is the line "relo_st:0x7f0164eee940 for off:0xffffff8009418308, relo_ed:0x7f01653e1570 for off:(nil)". Is off:(nil) normal?

cpumask commented 5 years ago

Yes, I think that's normal for the "tool/ext_sym".

qiuguoping commented 5 years ago

Please tell me if you have any progress about locating the problem, thanks.

qiuguoping commented 5 years ago

About this line:

    int addr_code = verify_addr_section(addr_syms_addrs);

We tested many times and found that there is usually no problem when addr_code == 2. When addr_code == 1, there are some address differences between system.map and the symbol table extracted by tool/ext_sym.

qiuguoping commented 5 years ago

I don't know if this case is caused by a compile option. If you know, please tell me and I will check it.

cpumask commented 5 years ago

Hi, Guoping,

Thanks for the testing. "addr_code == 1" means that the symbol table in the kernel image needs to be relocated/adjusted before it can be used. The relocation information is stored in separate sections of the image file, which the "ext_sym" tool needs to locate and parse. This is a much more complicated case than "addr_code == 2". It seems that a kernel with KASLR enabled needs relocation for its symbol table (https://lwn.net/Articles/673598/), so I think it has little to do with the compiler options.
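
To make the relocation step more concrete, here is a rough sketch of filling an address table from relocation entries. It rests on assumptions that this thread does not confirm: that the entries follow the standard ELF64 RELA layout, that the relevant ones are of type R_AARCH64_RELATIVE (value 1027 in the AArch64 ELF ABI), and that the load offset is zero, as in offline analysis. The names rela64, RELOC_AARCH64_RELATIVE and apply_relative_relocs are hypothetical; this is not the code of "tool/ext_sym", whose section format was reverse engineered and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as a standard ELF64 RELA entry. */
struct rela64 {
    uint64_t r_offset;  /* virtual address of the slot to patch */
    uint64_t r_info;    /* symbol index (high 32 bits), type (low 32 bits) */
    int64_t  r_addend;  /* value stored by a RELATIVE relocation */
};

#define RELOC_TYPE(info)        ((uint32_t)((info) & 0xffffffffu))
#define RELOC_AARCH64_RELATIVE  1027u

/*
 * Patch a table of 64-bit addresses that lives at virtual address table_va
 * and has n_slots entries. Only RELATIVE entries that point at an aligned
 * slot inside the table are applied. With a load offset of zero, the stored
 * value is simply the addend. Returns the number of slots patched.
 */
size_t apply_relative_relocs(const struct rela64 *rela, size_t n_rela,
                             uint64_t table_va, uint64_t *table,
                             size_t n_slots)
{
    size_t patched = 0;

    assert(rela != NULL && table != NULL);
    for (size_t i = 0; i < n_rela; i++) {
        if (RELOC_TYPE(rela[i].r_info) != RELOC_AARCH64_RELATIVE)
            continue;               /* other relocation types are ignored */
        if (rela[i].r_offset < table_va)
            continue;               /* slot lies before the table */

        uint64_t delta = rela[i].r_offset - table_va;
        if (delta % sizeof(uint64_t) != 0)
            continue;               /* not aligned to a table slot */
        if (delta / sizeof(uint64_t) >= n_slots)
            continue;               /* slot lies after the table */

        table[delta / sizeof(uint64_t)] = (uint64_t)rela[i].r_addend;
        patched++;
    }
    return patched;
}
```

In this model, one missed or misattributed relocation entry leaves a single slot of the address table wrong or empty, so any later step that pairs names with addresses has to handle such slots carefully.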

It's more difficult than I thought to obtain public documents regarding the format/location of the relocation sections in the kernel image. So I basically reverse engineered some kernel images to learn about the relocation sections when writing "tool/ext_sym". It's possible that the kernel you use has a slightly different relocation section format, but without the actual kernel image, it's really hard for us to figure it out....

I'm thinking that maybe we can first try to use the extracted symbol table and see whether it works normally. In the meantime, it would be great if you could find an open-source kernel image with the same "system.map" and "ext_sym" mismatch issue that we can take a look at.

qiuguoping commented 5 years ago

Thanks for your reply. I will try to test some open-source kernel images, and if they have similar problems, I will send them to you so you can locate the issue.

qiuguoping commented 5 years ago

Hi cpumask,

I spent several days locating this problem and may have found two bugs. I am sending you the details so you can check whether the two bugs are real.

Finding 1: the pointer p can be incremented twice in the function try_locate_relo_symaddr, which results in skipping one address.

    for(;p+g2<e;++p){
        ......
        if(i<g1-1){
            p+=(i+1);
            continue;
        }
        ......
    }

modified:

    for(;p+g2<e;){

Finding 2: when pt is NULL in the main function, addr_syms_names is not skipped at the same time, which results in a difference of one or more addresses between system.map and the symbol table extracted by tool/ext_sym.

    int main(int argc, char **argv){
        ....
        if(!pt)
            continue;
        ....
    }

modified:

    ....
    if(!pt){
        p = skip_cur_addr_sym( p );
        continue;
    }
    ....

    char* skip_cur_addr_sym( char *p ){
        int len;

        if ( p == NULL )
            return NULL;

        /* the first byte is the length of the current name entry */
        len = *(unsigned char *)p;
        /* skip the length byte plus the entry itself */
        p += len + 1;

        return p;
    }

cpumask commented 5 years ago

Hi, guoping,

Thanks so much for your efforts! I think you are right about the two bugs. Could you create a pull request for the bug fixes? I can fix them myself but I think you should have the credits :)
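
For readers following along, the first bug (an extra increment after "continue") can be reproduced in isolation. This is a generic sketch with hypothetical names (walk_buggy, walk_fixed, skip_at, jump), not the body of try_locate_relo_symaddr: it only shows that the increment in a for header still runs after "continue", so a manual jump inside the loop advances one element too far.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Visit indices 0..n-1, jumping ahead by `jump` when index skip_at is
 * reached. Visited indices are stored in out[]; the count is returned.
 * Buggy form: ++p in the for header also runs after `continue`.
 */
size_t walk_buggy(size_t n, size_t skip_at, size_t jump, size_t *out)
{
    size_t count = 0;

    assert(out != NULL);
    for (size_t p = 0; p < n; ++p) {
        if (p == skip_at) {
            p += jump;
            continue;       /* header ++p runs next: advance is jump + 1 */
        }
        out[count++] = p;
    }
    return count;
}

/* Fixed form: no increment in the header, so each path advances once. */
size_t walk_fixed(size_t n, size_t skip_at, size_t jump, size_t *out)
{
    size_t count = 0;

    assert(out != NULL);
    for (size_t p = 0; p < n; ) {
        if (p == skip_at) {
            p += jump;
            continue;       /* advance is exactly jump */
        }
        out[count++] = p;
        ++p;                /* normal path advances by one */
    }
    return count;
}
```

With n = 6, skip_at = 1 and jump = 2, the buggy walk visits 0, 4, 5 and silently drops index 3, while the fixed walk visits 0, 3, 4, 5.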

BTW, after applying these fixes, are there still differences between "system.map" and tool extracted symbol table? Thanks!
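
The second bug can be sketched the same way. The model below is simplified and partly assumed: names are stored as plain length-prefixed strings (one length byte followed by that many bytes), whereas the name entries in a real kernel image are compressed, and an address of 0 stands in for the "pt is NULL" case. The functions skip_name and paired_index are hypothetical. The point is only that skipping an address without also skipping its name shifts every later name onto the wrong address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Advance past one entry: a length byte followed by that many bytes. */
const unsigned char *skip_name(const unsigned char *p)
{
    assert(p != NULL);
    return p + *p + 1;
}

/*
 * Walk addrs[] and the name stream in parallel and return the index of the
 * address that ends up paired with `want`, or -1 if it is never paired.
 * An address of 0 marks an unresolved entry that is skipped. When
 * advance_on_skip is 0, the name cursor is left behind (the buggy behavior).
 */
int paired_index(const uint64_t *addrs, size_t n, const unsigned char *names,
                 const char *want, int advance_on_skip)
{
    const unsigned char *p = names;
    size_t want_len = strlen(want);

    for (size_t i = 0; i < n; i++) {
        if (addrs[i] == 0) {
            if (advance_on_skip)
                p = skip_name(p);   /* keep names in step with addresses */
            continue;
        }
        if ((size_t)*p == want_len && memcmp(p + 1, want, want_len) == 0)
            return (int)i;
        p = skip_name(p);
    }
    return -1;
}
```

For names a, b, c and addresses {0x10, 0, 0x30}, the buggy walk pairs "b" with index 2, which is the address that belongs to "c". With the name cursor advanced on skip, "c" is paired with index 2 and "b" is not paired at all.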

qiuguoping commented 5 years ago

Thank you for trusting me. Do you mean I should download the fiber source code with git, fix these bugs, and then upload the modified files with git? If so, I am sorry to say that I can't upload files outside the company because of our information security policy. After applying these fixes, there is still a small number of differences between "system.map" and the tool extracted symbol table; most of the differences are symbols that don't exist on the other side.

qiuguoping commented 5 years ago

By the way, may I ask when a new version of fiber will be released? I am looking forward to a solution to the inline function problem.

cpumask commented 5 years ago

I see. Then I'll make the bug fixes myself, thanks again for your help! Regarding the inline problem, unfortunately we are still testing it now, I'll let you know when the code is available. Thanks!

qiuguoping commented 5 years ago

Thank you