radareorg / radare2-r2pipe

Access radare2 via pipe from any programming language!
389 stars 99 forks source link

python r2pipe is suddenly failing on the binary info command `iIj` #137

Closed qkaiser closed 3 years ago

qkaiser commented 3 years ago

Environment

Description

r2pipe is suddenly failing on the binary info command iIj. We were previously using r2pipe v1.5.3 and since the move to r2pipe v1.6.0 we're observing erratic behavior when running that command using cmdj.

Test

This can be reproduced with the following script and running it against a small set of binary samples repeatedly.

#!/usr/bin/env python3
import r2pipe
import sys

_pipe = r2pipe.open(sys.argv[1], flags=["-2", "-S"])
res = _pipe.cmdj('iIj')
if res is None:
    print("{} - FAIL".format(sys.argv[1]))

This is the kind of results we observed:

for i in {0..20}; do echo "# RUN ${i}"; find samples -type f -exec python3 repro.py {} \; ; done
# RUN 0
r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0)
samples/593669cd24a1822f17b9d3e92fc066ac07eea4cb - FAIL
# RUN 1
# RUN 2
# RUN 3
# RUN 4
# RUN 5
# RUN 6
# RUN 7
# RUN 8
# RUN 9
r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0)
samples/8d5036cd3597676f5d5db0bc32742f3dc8e2c0e2 - FAIL
# RUN 10
# RUN 11
# RUN 12
# RUN 13
r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0)
samples/593669cd24a1822f17b9d3e92fc066ac07eea4cb - FAIL
# RUN 14
# RUN 15
# RUN 16
# RUN 17
r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0)
samples/928ad0fdff63fbf6c88b2e92d82539445cbb6472 - FAIL
# RUN 18
# RUN 19
# RUN 20

This is not due to the binaries themselves given that it ultimately works if you run the command repeatedly like the ugly script below:

#!/usr/bin/env python3
import r2pipe
import sys

_pipe = r2pipe.open(sys.argv[1], flags=["-2", "-S"])
res = None
i = 0
while res is None:
    res = _pipe.cmdj('iIj')
    i += 1

if i>1:
    print("{} - {} attempts".format(sys.argv[1], i))

Fix ?

I did some manual testing to identify the commit that trigger this behavior. It appeared with commit a7654a599fd04dd7c24a5f21e9437e9bad7a06a6. The exact reason behind it is unknown.

qkaiser commented 3 years ago

This is indeed due to commit a7654a599fd04dd7c24a5f21e9437e9bad7a06a6.

This commit introduced the following code:

diff --git a/python/r2pipe/open_sync.py b/python/r2pipe/open_sync.py
index 307e12d..4346b04 100644
--- a/python/r2pipe/open_sync.py
+++ b/python/r2pipe/open_sync.py
@@ -67,6 +67,10 @@ class open(OpenBase):
             except:
                 raise Exception("ERROR: Cannot find radare2 in PATH")
             self.process.stdout.read(1)  # Reads initial \x00
+            try:
+                self.process.stdin.write(("?V\n").encode("utf8"))
+            except:
+                raise Exception("ERROR: Cannot open %s" % filename)
             # make it non-blocking to speedup reading
             self.nonblocking = True
             if self.nonblocking:

A command is sent to get the radare2 version (?V) when testing for file open, but the output buffer is not read out so it will appear when r2pipe read from the buffer once the iIj command is sent out. This will trigger a JSON parsing issue.

Expected output:

{"arch":"mips","baddr":4194304,"binsz":167451,"bintype":"elf","bits":32,"canary":false,"class":"ELF32","compiled":"","compiler":"GCC: (GNU) 3.3.2","crypto":false,"dbg_file":"","endian":"little","havecode":true,"guid":"","intrp":"/lib/ld-uClibc.so.0","laddr":0,"lang":"c","linenum":false,"lsyms":false,"machine":"MIPS R3000","maxopsz":4,"minopsz":4,"nx":false,"os":"linux","cc":"","pcalign":4,"pic":false,"relocs":false,"relro":"no","rpath":"NONE","sanitiz":false,"static":false,"stripped":true,"subsys":"linux","va":true,"checksums":{}}

Actual output:

4.6.0-git aka 4.5.1 commit 26465
{"arch":"mips","baddr":4194304,"binsz":167451,"bintype":"elf","bits":32,"canary":false,"class":"ELF32","compiled":"","compiler":"GCC: (GNU) 3.3.2","crypto":false,"dbg_file":"","endian":"little","havecode":true,"guid":"","intrp":"/lib/ld-uClibc.so.0","laddr":0,"lang":"c","linenum":false,"lsyms":false,"machine":"MIPS R3000","maxopsz":4,"minopsz":4,"nx":false,"os":"linux","cc":"","pcalign":4,"pic":false,"relocs":false,"relro":"no","rpath":"NONE","sanitiz":false,"static":false,"stripped":true,"subsys":"linux","va":true,"checksums":{}}
trufae commented 3 years ago

I can't reproduce, so i assume it's because you are using a very old version of r2. Please install r2-5.3 or the last from git

as0ler commented 3 years ago

It works with last r2 version. I couldn't reproduce the issue.

image

image

We will close this issue. Please do not hesitate to reopen this ticket if it is still not working after update radare2 ;)

qkaiser commented 3 years ago

I checked again with the following r2 version:

r2 -v
radare2 5.4.0 26489 @ linux-x86-64 git.5.4.0-git
commit: 701363f61ceb02599589621ed0597da09c32f7aa build: 2021-08-20__15:55:12

And r2pipe v1.5.5:

>>> import r2pipe
>>> r2pipe.version()
'1.5.5'

The error still triggers randomly:

for i in {0..20}; do echo "# RUN ${i}"; find samples -type f -exec python3 repro.py {} \; ; done
# RUN 0
# RUN 1
r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0)
samples/8d5036cd3597676f5d5db0bc32742f3dc8e2c0e2 - 2 attempts

I looked at the result string parsed by cmdj by making the following change:

diff --git a/python/r2pipe/open_base.py b/python/r2pipe/open_base.py
index 9bc4985..2e1b1b0 100644
--- a/python/r2pipe/open_base.py
+++ b/python/r2pipe/open_base.py
@@ -245,10 +245,10 @@ class OpenBase(object):
                     Returns a JSON object respresenting the parsed JSON
                 """
         result = self.cmd(cmd, **kwargs)

         try:
             data = json.loads(result)
         except (ValueError, KeyError, TypeError) as e:
+           sys.stderr.write('output buf content: {}\n'.format(result.encode('utf-8').hex()))
             sys.stderr.write("r2pipe.cmdj.Error: %s\n" % (e))
             data = None
         return data

And discovered that, sometimes, the result string starts with a null byte:

for i in {0..20}; do echo "# RUN ${i}"; find samples-type f -exec python3 repro.py {} \; ; done
# RUN 0
output buf content: 007b2261726368223a226d69707322...

This is exactly what triggers this error:

>>> json.loads('\x00{"arch":"arm","baddr":32768,"binsz":152369,"bintype":"elf","bits":32,"canary":false,"class":"ELF32","compiled":"","compiler":"GCC: (GNU) 3.3.2 20031005 (Debian prerelease) GCC: (Buildroot 2010.05) 4.3.3","crypto":false,"dbg_file":"","endian":"little","havecode":true,"guid":"","intrp":"/lib/ld-uClibc.so.0","laddr":0,"lang":"c","linenum":false,"lsyms":false,"machine":"ARM","nx":true,"os":"linux","cc":"","pic":false,"relocs":false,"relro":"no","rpath":"NONE","sanitize":false,"static":false,"stripped":true,"subsys":"linux","va":true,"checksums":{}}\n')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

An easy fix is the one below, although it may be better to identify the null byte source:

diff --git a/python/r2pipe/open_base.py b/python/r2pipe/open_base.py
index 9bc4985..e259171 100644
--- a/python/r2pipe/open_base.py
+++ b/python/r2pipe/open_base.py
@@ -247,7 +247,7 @@ class OpenBase(object):
         result = self.cmd(cmd, **kwargs)

         try:
-            data = json.loads(result)
+            data = json.loads(result.strip('\x00'))
         except (ValueError, KeyError, TypeError) as e:
             sys.stderr.write("r2pipe.cmdj.Error: %s\n" % (e))
             data = None
trufae commented 3 years ago

Im fine to include a workaround for now, but that null byte trim should happen on cmd. Not cmdj. And would be good to investigate further to understand that racy bug and do the proper fix. Can you submit a pr with the workaround?

On 23 Aug 2021, at 11:40, Quentin Kaiser @.***> wrote:

 I checked again with the following r2 version:

r2 -v radare2 5.4.0 26489 @ linux-x86-64 git.5.4.0-git commit: 701363f61ceb02599589621ed0597da09c32f7aa build: 2021-08-20__15:55:12 And r2pipe v1.5.5:

import r2pipe r2pipe.version() '1.5.5' The error still triggers randomly:

for i in {0..20}; do echo "# RUN ${i}"; find samples -type f -exec python3 repro.py {} \; ; done

RUN 0

RUN 1

r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0) samples/8d5036cd3597676f5d5db0bc32742f3dc8e2c0e2 - 2 attempts I looked at the result string parsed by cmdj by making the following change:

diff --git a/python/r2pipe/open_base.py b/python/r2pipe/open_base.py index 9bc4985..2e1b1b0 100644 --- a/python/r2pipe/open_base.py +++ b/python/r2pipe/open_base.py @@ -245,10 +245,10 @@ class OpenBase(object): Returns a JSON object respresenting the parsed JSON """ result = self.cmd(cmd, **kwargs)

     try:
         data = json.loads(result)
     except (ValueError, KeyError, TypeError) as e:
  • sys.stderr.write('output buf content: {}\n'.format(result.encode('utf-8').hex())) sys.stderr.write("r2pipe.cmdj.Error: %s\n" % (e)) data = None return data And discovered that, sometimes, the result string starts with a null byte:

for i in {0..20}; do echo "# RUN ${i}"; find samples-type f -exec python3 repro.py {} \; ; done

RUN 0

output buf content: 007b2261726368223a226d69707322... This is exactly what triggers this error:

json.loads('\x00{"arch":"arm","baddr":32768,"binsz":152369,"bintype":"elf","bits":32,"canary":false,"class":"ELF32","compiled":"","compiler":"GCC: (GNU) 3.3.2 20031005 (Debian prerelease) GCC: (Buildroot 2010.05) 4.3.3","crypto":false,"dbg_file":"","endian":"little","havecode":true,"guid":"","intrp":"/lib/ld-uClibc.so.0","laddr":0,"lang":"c","linenum":false,"lsyms":false,"machine":"ARM","nx":true,"os":"linux","cc":"","pic":false,"relocs":false,"relro":"no","rpath":"NONE","sanitize":false,"static":false,"stripped":true,"subsys":"linux","va":true,"checksums":{}}\n') Traceback (most recent call last): File "", line 1, in File "/usr/lib/python3.8/json/init.py", line 357, in loads return _default_decoder.decode(s) File "/usr/lib/python3.8/json/decoder.py", line 337, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) An easy fix is the one below, although it may be better to identify the null byte source:

diff --git a/python/r2pipe/open_base.py b/python/r2pipe/open_base.py index 9bc4985..e259171 100644 --- a/python/r2pipe/open_base.py +++ b/python/r2pipe/open_base.py @@ -247,7 +247,7 @@ class OpenBase(object): result = self.cmd(cmd, **kwargs)

     try:
  • data = json.loads(result)
  • data = json.loads(result.strip('\x00')) except (ValueError, KeyError, TypeError) as e: sys.stderr.write("r2pipe.cmdj.Error: %s\n" % (e)) data = None — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android.
radare commented 3 years ago

The null byte is the output terminator so this could be a bug in the way the output of r2 is parsed in python. But the random nature of this bug is weird, and makes me think the nonblocking reads are sometimes racy in the way the current code works.

On 23 Aug 2021, at 13:05, pancake @.***> wrote:

 Im fine to include a workaround for now, but that null byte trim should happen on cmd. Not cmdj. And would be good to investigate further to understand that racy bug and do the proper fix. Can you submit a pr with the workaround?

On 23 Aug 2021, at 11:40, Quentin Kaiser @.***> wrote:

 I checked again with the following r2 version:

r2 -v radare2 5.4.0 26489 @ linux-x86-64 git.5.4.0-git commit: 701363f61ceb02599589621ed0597da09c32f7aa build: 2021-08-20__15:55:12 And r2pipe v1.5.5:

import r2pipe r2pipe.version() '1.5.5' The error still triggers randomly:

for i in {0..20}; do echo "# RUN ${i}"; find samples -type f -exec python3 repro.py {} \; ; done

RUN 0

RUN 1

r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0) samples/8d5036cd3597676f5d5db0bc32742f3dc8e2c0e2 - 2 attempts I looked at the result string parsed by cmdj by making the following change:

diff --git a/python/r2pipe/open_base.py b/python/r2pipe/open_base.py index 9bc4985..2e1b1b0 100644 --- a/python/r2pipe/open_base.py +++ b/python/r2pipe/open_base.py @@ -245,10 +245,10 @@ class OpenBase(object): Returns a JSON object respresenting the parsed JSON """ result = self.cmd(cmd, **kwargs)

try: data = json.loads(result) except (ValueError, KeyError, TypeError) as e:

  • sys.stderr.write('output buf content: {}\n'.format(result.encode('utf-8').hex())) sys.stderr.write("r2pipe.cmdj.Error: %s\n" % (e)) data = None return data And discovered that, sometimes, the result string starts with a null byte:

for i in {0..20}; do echo "# RUN ${i}"; find samples-type f -exec python3 repro.py {} \; ; done

RUN 0

output buf content: 007b2261726368223a226d69707322... This is exactly what triggers this error:

json.loads('\x00{"arch":"arm","baddr":32768,"binsz":152369,"bintype":"elf","bits":32,"canary":false,"class":"ELF32","compiled":"","compiler":"GCC: (GNU) 3.3.2 20031005 (Debian prerelease) GCC: (Buildroot 2010.05) 4.3.3","crypto":false,"dbg_file":"","endian":"little","havecode":true,"guid":"","intrp":"/lib/ld-uClibc.so.0","laddr":0,"lang":"c","linenum":false,"lsyms":false,"machine":"ARM","nx":true,"os":"linux","cc":"","pic":false,"relocs":false,"relro":"no","rpath":"NONE","sanitize":false,"static":false,"stripped":true,"subsys":"linux","va":true,"checksums":{}}\n') Traceback (most recent call last): File "", line 1, in File "/usr/lib/python3.8/json/init.py", line 357, in loads return _default_decoder.decode(s) File "/usr/lib/python3.8/json/decoder.py", line 337, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) An easy fix is the one below, although it may be better to identify the null byte source:

diff --git a/python/r2pipe/open_base.py b/python/r2pipe/open_base.py index 9bc4985..e259171 100644 --- a/python/r2pipe/open_base.py +++ b/python/r2pipe/open_base.py @@ -247,7 +247,7 @@ class OpenBase(object): result = self.cmd(cmd, **kwargs)

try:

  • data = json.loads(result)
  • data = json.loads(result.strip('\x00')) except (ValueError, KeyError, TypeError) as e: sys.stderr.write("r2pipe.cmdj.Error: %s\n" % (e)) data = None — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android.
enovella commented 3 years ago

Getting a similar error with r2pipe v1.6.0 (and not in v.1.5.3):

r2pipe.cmdj.Error: Extra data: line 1 column 4 (char 3)
trufae commented 3 years ago

I can repro. Investigating

trufae commented 3 years ago

Uhm i fixed some issues in r2 and r2pipe along this time, can you retry?

qkaiser commented 3 years ago

Hi !

So I checked and the race condition still exists, but this time for a different reason.

The error was previously triggered due to the presence of a null byte at the beginning of the result string. Now the result string is simply empty.

for i in {0..20}; do echo "# RUN ${i}"; find samples -type f -exec python3 repro.py {} \; ; done
# RUN 0
# RUN 1
output buf content: 
r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0)
samples/vuln/26907a7fdfadd11c4f0d5786a3327fdfd4f787c4 - 2 attempts
output buf content: 
r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0)
samples/fixed/ddf7c27336cb8a6d7b5128884aa10a4d2ae65a14 - 2 attempts
# RUN 2
output buf content: 
r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0)
samples/vuln/af8fe3784e623b5b72eadfbb8675e6c1d2e9c9d4 - 2 attempts
output buf content: 
r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0)
samples/fixed/b1e92d58c925e77dc0042a583fd91d48b362fba6 - 2 attempts
output buf content: 
r2pipe.cmdj.Error: Expecting value: line 1 column 1 (char 0)
samples/fixed/8d5036cd3597676f5d5db0bc32742f3dc8e2c0e2 - 2 attempts

This is the equivalent of doing:

>>> json.loads('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

r2 version in use:

r2 -v
radare2 5.5.1 27053 @ linux-x86-64 git.5.4.0
commit: 120f43583a282c662dde5dbd82665421a9dff6c5 build: 2021-11-23__09:30:22
>>> r2pipe.version()
'1.6.4'

I saw your tentative fix at https://github.com/radareorg/radare2-r2pipe/commit/9706a2eeecee9e24b223a463dd531cf96a272a95, will take a deeper look to see if I can spot anything :)

trufae commented 3 years ago

fixed in master (and released 1.6.5) thanks!

qkaiser commented 3 years ago

Just confirmed it works, thanks a lot !