GrammaTech / ddisasm

A fast and accurate disassembler
https://grammatech.github.io/ddisasm/
GNU Affero General Public License v3.0
645 stars 60 forks source link

casting from number to unsigned can cause in x86_32 mode have 64bit size offset #35

Closed StarGazerM closed 2 years ago

StarGazerM commented 2 years ago

Hi: When I am reading souffle code in src/datalog/value_analysis.dl.

// mov DWORD from memory
value_reg_edge(EA,Reg,EA,"NONE",0,Val):-
    def_used_for_address(EA,Reg),
    arch.move_operation(Operation),
    instruction(EA,_,_,Operation,Op1,Op2,0,0,_,_),
    op_regdirect_contains_reg(Op2,Reg),
    op_indirect(Op1,"NONE","NONE","NONE",_,Offset,32),
    Offset_addr = as(Offset,address),                                           // here
    data_byte(Offset_addr,Byte0),
    data_byte(Offset_addr+1,Byte1),
    data_byte(Offset_addr+2,Byte2),
    data_byte(Offset_addr+3,Byte3),
    Byte3 <= 128,
    Val = as(Byte3*2^24+ Byte2*2^16 + Byte1*2^8 + Byte0,number).

I found in line 123, a variable has type number is casted to address(unsigned). ddisasm require enable 64bit domain when compiling souffle, so this casting will always make Offset_addr become a u64 address even when handling 32bit binary. For example, -1 will be converted into 18446744073709551615, while in 32bit mode it should be 4294967295.

Yihao

aeflores commented 2 years ago

Hi Yihao, I think you might be right. In this case, we should restrict the rule to only positive offsets. I'll make a fix.

aeflores commented 2 years ago

this is fixed now https://github.com/GrammaTech/ddisasm/commit/a51fb7b8f43b9d4522901d7774a49e3bbb433344

StarGazerM commented 2 years ago

This is not the only every places, every conversion from sign to unsigned number can cause overflow, if possible could you kindly check all datalog code base for pattern as\(.+,address\). I searched on repo there are a lot of possible overflow places. Thanks!