Open bb33bb opened 2 months ago
"-1" in this line v_1 = -1 + (v & 0x3fffffffff) -1 is retived from symbol i think. But in vmlinux , i think in other cases , may be it is the same. In vmlinux, the data section have many variables with initialized values. These initialized values can be 0 or -1 or 0xffffffffffffffff If Ninja loads these values and use these values to deep analizes all functions , it is really a little aggressive. Because after the initialized values loaded into the function, the flow of function will change to a specific case which can not stand for or can not show with the full functions infomation to ninja users.
pure talking may be is really not ok to show it. here are 2 files in .ko file because the symbol is extern, and Ninja can not load the value of symbol i think. then the c code is OK as following:
00007294 int64_t run_command(int64_t arg1)
00007294 {
000072e0 int64_t x8_6 = ((arg1 - *(uint64_t*)(memstart_addr & 0xffff)) | 0xffffffc000000000);
000072f4 printk(0x8356, arg1, x8_6);
0000730c int64_t var_58;
0000730c
0000730c if ((x8_6 & 0x40) == 0)
0000735c var_58 = (x8_6 - *(uint64_t*)(kimage_voffset & 0xffff));
0000730c else
00007338 var_58 = ((x8_6 & 0x3fffffffff) + *(uint64_t*)(memstart_addr & 0xffff));
00007338
0000738c printk(0x8356, var_58, x8_6);
000073a0 return 0;
00007294 }
but if we are reversing vmlinux, the symbols according with data can be retrived , then the functions have a different c code as following:
ffffff800850d324 int diskstats_show(struct seq_file* seqf, void* v)
ffffff800850d324 {
ffffff800850d334 void* v_1;
ffffff800850d334
ffffff800850d334 if ((v & 0x40) != 0)
ffffff800850d350 v_1 = (-1 + (v & 0x3fffffffff));
ffffff800850d334 else
ffffff800850d340 v_1 = v;
ffffff800850d340
ffffff800850d360 printk("v1 : %lx\n", v_1);
ffffff800850d378 printk("p1 : %lx\n", (((char*)v_1 + 1) | 0xffffffc000000000));
ffffff800850d388 return 0;
ffffff800850d324 }
I did not revere it so thats just my guess Sorry to make a wrong guess if I am wrong
The reason why we used the value of the memstart_addr
variable is because it sits in a section with read-only semantics:
Surprisingly, we can also see several such read-only sections are actually in a segment marked as RWX
I am not sure how this should be handled, I will bring it up for some discussions
A workaround is to right-click on the .rodata
section, edit it, and set it to Writable Data
, then the analysis will be fixed:
A workaround is to right-click on the
.rodata
section, edit it, and set it toWritable Data
, then the analysis will be fixed:
谢谢! 这个问题确实是有时候有点点小小的迷惑。也许您提到的这个,设置为可写确实是我需要的。 这个Image是三星官方源码编译的,由于三星手机内核的一点点特殊,也许有可能是它尽量标记为只读,或者初始化后尽量标记为只读。或者也可能是vmlinux就这样。 我不知道我们是否有必要将这个静态分析,优化的选项前置,因为确实对准确性略有影响,或者还是我们考虑对齐其他同行产品? 目前我开始采用您提得办法设置读写来分析。
Thank you!
This issue is indeed a bit confusing sometimes. Perhaps what you mentioned about setting it to writable is indeed what I need.
This Image is compiled from Samsung's official source code. Due to some slight peculiarities in Samsung phone kernels, it's possible that they try to mark it as read-only, or mark it as read-only as much as possible after initialization. Or it could be that the vmlinux is just like this.
I'm not sure if we need to prioritize this static analysis and optimization option, as it does slightly affect accuracy, or if we should consider aligning with other peer products?
Currently, I've started using the method you suggested of setting it to read-write for analysis.
也许我们还是可以考虑下处理这个问题。如下所示: 在静态文件中,默认编译出来的初始化的数值,和真机中的确实还是不一样。 为了安全,三星手机内核在启动完,data段初始化结束后,会将一部分data段设置为只读。 于是出现了vmlinux中标记为只读,实际上启动后会是修改后的数值。 大概我的理解是这样。如果不正确,还望见谅。 如下所示是一个demo的截图 我在通过您提到的方法 https://github.com/Vector35/binaryninja-api/issues/5834#issuecomment-2286381557 修改属性后,可以显示出 memstart_addr符号
Perhaps we can still consider addressing this issue.
As shown below: In static files, the default compiled initialization values are indeed different from those
in real devices.
For safety, after booting and initializing the data segment, Samsung phone kernels set part of the data
segment as read-only.
This results in a situation where values marked as read-only in vmlinux are actually modified after startup.
This is roughly my understanding.
If it's incorrect, please forgive me.
Below is a screenshot of a demo showing that after modifying the attributes using the method you
mentioned in #5834 (comment),
I can display the memstart_addr symbol.
查看符号处的数据和真机中的数据如下所示 memstart_addr物理含义是,text段前面的固定的物理地址的起始地址。
open python console then paste the following python scripts:
for section in bv.sections.values():
print(f"Section name: {section.name}")
section.permissions = SectionSemantics.ReadWriteDataSectionSemantics
permissions
rodata_section.permissions = SectionSemantics.ReadWriteDataSectionSemantics it seems like this is no usefull
and i tried to use
rodata_section = bv.get_section_by_name('.rodata')
rodata_section.semantics = SectionSemantics.ReadWriteDataSectionSemantics
bv.update_analysis_and_wait()
the ret is
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: property 'semantics' of 'Section' object has no setter
is there any possible for us to use one line of python script finish this job?
I've updated the title based on some internal discussion.
The new memory map system is designed to unify and resolve these issues and once the roadmap for that is complete we expect this issue and similar ones to be resolved.
Version and Platform (required):
Bug Description: [after ninja load symbol when we need to F5, the c cocde ](https://github.com/Vector35/binaryninja-api/issues/5819)
The key problem: When I am reversing vmlinux, after loading vmlinux, Ninja will load symbols and get the value of symbols, then use the values in functions and output the static analysis answer based on the symbol data. But to be honest, a lot of symbol values will change after loading into the kernel. Here is a real case:
after compiled , Using ninja to open vmlinux. we can see the code as following:
But in ghidra we can see :
and we can check ,the ans is really different.