FernandoDoming / r2diaphora

Port of the binary diffing library, diaphora, for radare2 and mariadb
GNU Affero General Public License v3.0
46 stars, 1 fork

xml parser error #5

Open Ret2c7 opened 1 year ago

Ret2c7 commented 1 year ago

Description:

This looks like an XML parsing issue. I tried to find the cause but have no clue so far. Because the error output is very long and appears to repeat the same issue, I have included only a portion of it.

I used the following command:

r2diaphora server server2

The error is as follows:

Traceback (most recent call last):
  File "/home/c7/.local/lib/python3.8/site-packages/r2diaphora/diaphora_r2.py", line 289, in decompile_and_get
    self.pseudo_hash[ea] = calc_pseudo_hash(ea)
  File "/home/c7/.local/lib/python3.8/site-packages/r2diaphora/idaapi/idaapi_to_r2.py", line 347, in calc_pseudo_hash
    tree = ET.ElementTree(ET.fromstring(xml))
  File "/usr/lib/python3.8/xml/etree/ElementTree.py", line 1321, in XML
    return parser.close()
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
2023-07-26 17:42:03,195 [INFO] - Exported sym.transcmp fn (95/992). Elapsed 19 s, remaining time ~186 s
2023-07-26 17:42:03,238 [ERROR] - Exception while calculating pseudocode primes hash for function 0x406210
Traceback (most recent call last):
  File "/home/c7/.local/lib/python3.8/site-packages/r2diaphora/diaphora_r2.py", line 289, in decompile_and_get
    self.pseudo_hash[ea] = calc_pseudo_hash(ea)
  File "/home/c7/.local/lib/python3.8/site-packages/r2diaphora/idaapi/idaapi_to_r2.py", line 347, in calc_pseudo_hash
    tree = ET.ElementTree(ET.fromstring(xml))
  File "/usr/lib/python3.8/xml/etree/ElementTree.py", line 1321, in XML
    return parser.close()
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
2023-07-26 17:42:03,248 [INFO] - Exported sym.alias_compare fn (96/992). Elapsed 19 s, remaining time ~185 s
2023-07-26 17:42:03,503 [ERROR] - Exception while calculating pseudocode primes hash for function 0x406230
Traceback (most recent call last):
  File "/home/c7/.local/lib/python3.8/site-packages/r2diaphora/diaphora_r2.py", line 289, in decompile_and_get
    self.pseudo_hash[ea] = calc_pseudo_hash(ea)
  File "/home/c7/.local/lib/python3.8/site-packages/r2diaphora/idaapi/idaapi_to_r2.py", line 347, in calc_pseudo_hash
    tree = ET.ElementTree(ET.fromstring(xml))
  File "/usr/lib/python3.8/xml/etree/ElementTree.py", line 1321, in XML
    return parser.close()
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
2023-07-26 17:42:03,899 [INFO] - Exported sym.read_alias_file fn (97/992). Elapsed 20 s, remaining time ~189 s
2023-07-26 17:43:05,057 [WARNING] - Timeout while reading function at 0x4223568 from file server
2023-07-26 17:44:10,263 [WARNING] - Timeout while reading function at 0x4350752 from file server
2023-07-26 17:44:36,976 [ERROR] - Exception while calculating pseudocode primes hash for function 0x4071f0
Traceback (most recent call last):
  File "/home/c7/.local/lib/python3.8/site-packages/r2diaphora/diaphora_r2.py", line 289, in decompile_and_get
    self.pseudo_hash[ea] = calc_pseudo_hash(ea)
  File "/home/c7/.local/lib/python3.8/site-packages/r2diaphora/idaapi/idaapi_to_r2.py", line 347, in calc_pseudo_hash
    tree = ET.ElementTree(ET.fromstring(xml))
  File "/usr/lib/python3.8/xml/etree/ElementTree.py", line 1321, in XML
    return parser.close()
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
2023-07-26 17:44:53,721 [INFO] - Exported sym.__gettext_free_exp fn (100/992). Elapsed 190 s, remaining time ~1697 s
2023-07-26 17:46:02,637 [ERROR] - NO BASIC BLOCKS FOR 4238880

I have attached the files I tested, in the hope that they are helpful: binary.zip

FernandoDoming commented 1 year ago

This is most likely due to an outdated version of r2ghidra. Please try updating r2ghidra with r2pm -ci r2ghidra. If it still fails, please post your radare2, r2ghidra and r2diaphora versions.
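For context, "no element found: line 1, column 0" is the message ElementTree raises when it is handed an empty string, which is consistent with the decompiler returning no XML at all. The sketch below reproduces that message and shows one defensive way to handle it. The function name parse_decompiler_xml is hypothetical and is not part of r2diaphora; this is an illustration under that assumption, not the project's actual fix.

```python
import xml.etree.ElementTree as ET


def parse_decompiler_xml(xml):
    """Hypothetical helper: return an ElementTree, or None if xml is empty or malformed."""
    # An empty or whitespace-only string is what triggers "no element found".
    if not xml or not xml.strip():
        return None
    try:
        return ET.ElementTree(ET.fromstring(xml))
    except ET.ParseError:
        # Malformed XML: report "nothing parsed" instead of raising.
        return None


# Reproduce the error from the traceback with an empty input.
try:
    ET.fromstring("")
except ET.ParseError as exc:
    empty_input_message = str(exc)

print(empty_input_message)
print(parse_decompiler_xml(""))
print(parse_decompiler_xml("<function><block/></function>").getroot().tag)
```

A caller could skip the pseudocode hash for a function when the helper returns None, rather than logging a full traceback for each one.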

Ret2c7 commented 1 year ago

Great! The XML-related issues seem to be resolved. However, every time [ERROR] - NO BASIC BLOCKS FOR appears, my computer hangs, and I don't know what the problem is. This also happened earlier, while the XML errors were still occurring. The last line of the error output I posted above is an example.

FernandoDoming commented 1 year ago

As you can see from the code here: https://github.com/FernandoDoming/r2diaphora/blob/master/r2diaphora/idaapi/idaapi_to_r2.py#L230, that error means that afbj @ <addr> failed in radare2. I tried to reproduce the error on my system using the address shown in your logs, but sadly I could not:

$ r2 ~/Downloads/binary/server
 -- Ceci n'est pas une r2pipe
[0x00401ae0]> aaaa
INFO: Analyze all flags starting with sym. and entry0 (aa)
INFO: Analyze imports (af@@@i)
INFO: Analyze all functions arguments/locals (afva@@@F)
INFO: Analyze function calls (aac)
INFO: Analyze len bytes of instructions for references (aar)
INFO: Finding and parsing C++ vtables (avrr)
INFO: Type matching analysis for all functions (aaft)
INFO: Propagate noreturn information (aanr)
INFO: Scanning for strings constructed in code (/azs)
INFO: Finding function preludes (aap)
INFO: Enable anal.types.constraint for experimental type propagation
[0x00401ae0]> afbj @ 4238880 | jq | more
[
  {
    "addr": 4238640,
    "size": 25,
    "jump": 4252533,
    "fail": 4238665,
    "opaddr": 18446744073709552000,
    "inputs": 0,
    "outputs": 2,
    "ninstr": 10,
    "instrs": [
      4238640,
      4238642,
      4238644,
      4238646,
      4238648,
      4238651,
      4238652,
      4238653,
      4238657,
      4238659
    ],
    "traced": 1
  },
  {
    "addr": 4238665,
    "size": 21,
    "jump": 4238910,
    "fail": 4238686,
    "opaddr": 18446744073709552000,
    "inputs": 1,
    "outputs": 2,
    "ninstr": 5,
    "instrs": [
      4238665,
      4238669,
      4238672,
      4238675,
      4238680
    ],
    "traced": 1
  },
  [...]

I also tested on server2, with the same results. Can you do the same on your machine and post the results?
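The same check can be scripted rather than typed at the r2 prompt. The sketch below assumes the r2pipe Python package and a working radare2 install; basic_blocks and query_binary are hypothetical names and this is not the code r2diaphora itself runs. It treats empty or non-JSON output from afbj as "no basic blocks".

```python
import json


def basic_blocks(r2, addr):
    """Return the basic blocks reported by `afbj @ addr`, or [] if there are none.

    `r2` is any object with a cmd(str) -> str method, such as an r2pipe session.
    """
    raw = r2.cmd(f"afbj @ {addr}")
    if not raw or not raw.strip():
        return []
    try:
        blocks = json.loads(raw)
    except json.JSONDecodeError:
        return []
    return blocks if isinstance(blocks, list) else []


def query_binary(path, addr):
    """Open a binary, run the same analysis as in the session above, and query afbj."""
    # Imported lazily so basic_blocks() can be used without radare2 installed.
    import r2pipe

    r2 = r2pipe.open(path)
    try:
        r2.cmd("aaaa")
        return basic_blocks(r2, addr)
    finally:
        r2.quit()
```

For example, query_binary("server", 4238880) would be expected to return a non-empty list on a machine where the interactive command above succeeds.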

Ret2c7 commented 1 year ago

Sorry for the late reply. The NO BASIC BLOCKS FOR error no longer seems to occur for 4238880, but it now appears at a different address, 4389338.

2023-07-31 15:49:32,316 [INFO] - Exported sym.call_dl_lookup fn (276/992). Elapsed 337 s, remaining time ~876 s
2023-07-31 15:49:32,423 [INFO] - Exported sym.__dup fn (277/992). Elapsed 337 s, remaining time ~871 s
2023-07-31 15:49:32,597 [INFO] - Exported sym.__sfp_handle_exceptions fn (278/992). Elapsed 337 s, remaining time ~868 s
2023-07-31 15:49:33,426 [INFO] - Exported sym.tsearch fn (279/992). Elapsed 338 s, remaining time ~865 s
2023-07-31 15:50:33,472 [ERROR] - NO BASIC BLOCKS FOR 4389338
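The NO BASIC BLOCKS FOR message prints the address in decimal, while the radare2 prompt and the pseudocode hash errors use hexadecimal. A small conversion makes the two easier to compare. The helper name to_hex is only illustrative.

```python
def to_hex(addr):
    """Convert a decimal address (int or digit string) to a hex string."""
    return hex(int(addr))


print(to_hex(4389338))  # 0x42f9da
print(to_hex(4238880))  # 0x40ae20
```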

I tried your method, and the command seems to work when I run it manually:

[0x00401ae0]> aaaa
INFO: Analyze all flags starting with sym. and entry0 (aa)
INFO: Analyze imports (af@@@i)
INFO: Analyze all functions arguments/locals (afva@@@F)
INFO: Analyze function calls (aac)
INFO: Analyze len bytes of instructions for references (aar)
INFO: Finding and parsing C++ vtables (avrr)
INFO: Type matching analysis for all functions (aaft)
INFO: Propagate noreturn information (aanr)
INFO: Scanning for strings constructed in code (/azs)
INFO: Finding function preludes (aap)
INFO: Enable anal.types.constraint for experimental type propagation
[0x00401ae0]> afbj @ 4389338 | jq | more
[
  {
    "addr": 4377728,
    "size": 13,
    "jump": 4383940,
    "fail": 4377741,
    "opaddr": 18446744073709552000,
    "inputs": 0,
    "outputs": 2,
    "ninstr": 3,
    "instrs": [
      4377728,
      4377732,
      4377735
    ],
    "traced": 1
  },
  {
    "addr": 4377741,
    "size": 10,
    "jump": 4383952,
    "fail": 4377751,
    "opaddr": 18446744073709552000,
    "inputs": 1,
    "outputs": 2,
    "ninstr": 2,
    "instrs": [
      4377741,
      4377745
    ],
    "traced": 1
  },
  {
    "addr": 4377751,
    "size": 20,
    "jump": 4377856,
    "fail": 4377771,
--More--

My radare2 version is:

radare2 5.8.9 30962 @ linux-x86-64
birth: git.5.8.8-361-gaa72538816 2023-07-26__10:37:53
commit: aa72538816ef107736b5f2150a9fbba05542b2fa
options: gpl -O? cs:5 cl:2 make

My r2diaphora version is probably:

MariaDB [1aca930421628ccdf8cec483f335c7c02f5612160cd76e7ffd73d6262c04d979]> select * from version;
+-------+
| value |
+-------+
| 2.0.6 |
+-------+

I don't know how to check the r2ghidra version :)