radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0

pd gets off track on large instructions crossing 0x100 offset #21195

Open swoops opened 1 year ago

swoops commented 1 year ago

Environment

Fri Dec 23 06:21:14 PM EST 2022
radare2 5.8.1 30050 @ linux-x86-64 git.5.8.0-9-g341695d158
commit: 341695d1589386bcd56c2c3fd44c95cdfd0af00b build: 2022-12-23__17:22:16
Linux x86_64

Description

The pd command can get off track and does not jump the proper size forward before printing the next instruction. This error seems to hover around the 0x100 offset, so I suspect it has something to do with a buffer of that size.

The size seems to be reported correctly by the plugin, since rasm2 is able to jump forward correctly.

Test

The output below shows the behavior. The one-byte op short_binunicode (0x8c) at offset 0xb is followed by 0xf4 for the string length, then 0xf4 bytes for the string. This means the next op should be found at 0x00000101, but pd prints an invalid op for the last character of the string, "Z", at 0x100.

$ r2 -a pickle -qqc 'pd 6' file.pickle
            0x00000000      8004           proto 0x4
            0x00000002      95f800000000.  frame 0xf8
            0x0000000b      8cf441424344.  short_binunicode "ABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCD"truncated ; 0xd
            0x00000100      5a             invalid
            0x00000101      94             memoize
            0x00000102      2e             stop

rasm2, V and pickletools get it right:

$ rasm2 -Bda pickle -f file.pickle
proto 0x4
frame 0xf8
short_binunicode "ABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCD"truncated
memoize
stop
$ python3 -m pickletools file.pickle
    0: \x80 PROTO      4
    2: \x95 FRAME      248
   11: \x8c SHORT_BINUNICODE 'ABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDZZZZ'
  257: \x94 MEMOIZE    (as 0)
  258: .    STOP
highest protocol among opcodes = 4
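
The offsets above can be reproduced without radare2. The following is a minimal sketch that rebuilds a pickle with the same layout and lists opcode positions with `pickletools.genops`. The helper name `build_pickle` and the exact payload (60 repetitions of "ABCD" followed by "ZZZZ", 0xf4 bytes in total) are assumptions chosen to match the output shown, not the contents of the attached file.

```python
import pickletools
import struct


def build_pickle(payload: bytes) -> bytes:
    """Hypothetical helper: PROTO 4, FRAME, SHORT_BINUNICODE, MEMOIZE, STOP."""
    if len(payload) > 0xFF:
        raise ValueError("SHORT_BINUNICODE has a one-byte length prefix")
    # 0x8c = SHORT_BINUNICODE, 0x94 = MEMOIZE, '.' = STOP
    body = b"\x8c" + bytes([len(payload)]) + payload + b"\x94" + b"."
    # 0x80 0x04 = PROTO 4, 0x95 = FRAME with an 8-byte little-endian length
    return b"\x80\x04" + b"\x95" + struct.pack("<Q", len(body)) + body


# Assumed payload: 0xf4 bytes ending in "ZZZZ", as in the pickletools output.
payload = b"ABCD" * 60 + b"ZZZZ"
data = build_pickle(payload)

# genops yields (opcode, argument, position) for each opcode in the stream.
offsets = [(op.name, pos) for op, _arg, pos in pickletools.genops(data)]

for name, pos in offsets:
    print(f"{pos:#06x}  {name}")

# The op after the string starts at 0xb + 1 (opcode) + 1 (length) + 0xf4.
expected_next = 0xB + 1 + 1 + len(payload)
```

With this layout `expected_next` is 0x101, which is where MEMOIZE appears in the listing, one byte past the 0x100 where pd prints the invalid op.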

V will mess up on a larger string though:

[0x00000000 [xAdvc]0 0% 285 log]> pd $r
            0x00000000      8004           proto 0x4
            0x00000002      951701000000.  frame 0x117
            0x0000000b      581001000041.  binunicode "ABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCD"truncated    ; 0x10
            0x0000011d      5a             invalid
            0x0000011e      5a             invalid
            0x0000011f      5a             invalid
            0x00000120      94             memoize
            0x00000121      2e             stop

But when I move the cursor closer to the 0xb opcode, V starts fixing itself:

[0x00000002 [xAdvc]0 0% 285 log]> pd $r
            0x00000002      951701000000.  frame 0x117
            0x0000000b      581001000041.  binunicode "ABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCDABCD"truncated    ; 0x10
            0x0000011f      5a             invalid
            0x00000120      94             memoize
            0x00000121      2e             stop

I will look into it eventually, but I don't know much about pd.

files.zip

trufae commented 1 year ago

0x100 is probably the blocksize you have configured

trufae commented 1 year ago

The size of the instruction right now can't be bigger than the provided buffer, and that's tied to the blocksize. IMHO you should handle this case in the pickle plugin.

Oh wait, how big is the maxinstsize the plugin is reporting? Because if that's correct then there's a bug in pd.

Lazula commented 1 year ago

> The size of the instruction right now cant be bigger than the provided buffer

The updated pd handles this in the retry section: it allocates a new buffer of sufficient size (maxinstrsize+32). In this case it looks like this is occurring because pickle's R_ANAL_ARCHINFO_MAX_OP_SIZE is 129. So, what happens here is:

To confirm, changing libr/arch/p/pickle/plugin.c MAXSTRLEN to 256 fixes this since the buffer is large enough, but it would still fail with a sufficiently long string. So, the problem is the plugin's MAX_OP_SIZE. The snes plugin has the same problem of arbitrary length strings.
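
To make the size mismatch concrete, here is a rough arithmetic sketch in Python. It is a toy model, not radare2 code. The only values taken from this thread are the reported maximum op size of 129 and the `+32` slack of the retry buffer; the function names are hypothetical, and the exact check pd performs is an assumption.

```python
def short_binunicode_size(n: int) -> int:
    """Encoded size: 1 opcode byte + 1 length byte + n string bytes."""
    return 1 + 1 + n


def binunicode_size(n: int) -> int:
    """Encoded size: 1 opcode byte + 4 length bytes + n string bytes."""
    return 1 + 4 + n


def fits_retry_buffer(op_size: int, max_op_size: int, slack: int = 32) -> bool:
    """Assumed model: the retry buffer holds max_op_size + slack bytes."""
    return op_size <= max_op_size + slack


REPORTED_MAX = 129  # value quoted in this thread for the pickle plugin

# The two instructions from the report: string lengths 0xf4 and 0x110.
small = short_binunicode_size(0xF4)   # 0xf6 bytes
large = binunicode_size(0x110)        # 0x115 bytes

for size in (small, large):
    print(f"{size:#x} bytes, fits: {fits_retry_buffer(size, REPORTED_MAX)}")


def first_overflowing_length(max_op_size: int, slack: int = 32) -> int:
    """Shortest binunicode string that no longer fits, for any fixed cap."""
    return max_op_size + slack - 5 + 1
```

Under this model both instructions exceed the 161-byte retry buffer, and `first_overflowing_length` illustrates why raising the cap only moves the failure point: for any fixed maximum there is a string length that overflows it.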

trufae commented 1 year ago

IMHO this should be handled as payload, not as the instruction size. Dalvik has similar instructions: the size of the payload is defined inside the instruction, so there is no need to read 1KB of data every time we parse an instruction.
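
One way to picture the payload idea for pickle is to decode only the opcode and its length prefix, then treat the string bytes as a payload whose size is already known. The sketch below is illustrative Python, not a proposal for the radare2 plugin API; `LENGTH_PREFIXED` and `split_op` are hypothetical names, and only the two opcodes seen in this issue are covered.

```python
import struct

# opcode byte -> (mnemonic, struct format of the little-endian length prefix)
LENGTH_PREFIXED = {
    0x8C: ("short_binunicode", "<B"),  # 1-byte length
    0x58: ("binunicode", "<I"),        # 4-byte length
}


def split_op(header: bytes):
    """Return (mnemonic, header_size, payload_size) from the first few bytes.

    Only the opcode and its length prefix are read; the string itself is
    never touched, so a handful of bytes is enough for any string length.
    """
    name, fmt = LENGTH_PREFIXED[header[0]]
    (payload_size,) = struct.unpack_from(fmt, header, 1)
    return name, 1 + struct.calcsize(fmt), payload_size


# The six bytes pd displays for each instruction in the report.
first = split_op(bytes.fromhex("8cf441424344"))
second = split_op(bytes.fromhex("581001000041"))

for name, header_size, payload_size in (first, second):
    next_op = 0xB + header_size + payload_size
    print(f"{name}: header {header_size}, payload {payload_size:#x}, next op at {next_op:#x}")
```

Both instructions start at 0xb in the examples above, so the computed next-op offsets are 0x101 and 0x120, matching where memoize is listed in each case.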

Lazula commented 1 year ago

It looks like that's a predefined payload size and not for arbitrary length NULL-terminated strings, I'm not sure how this is intended to be implemented.

trufae commented 1 year ago

No intention to implement this, but if anybody has an idea or proposal, it seems like a good time to do it before breaking the ABI.