Closed codepainters closed 11 years ago
Following patch made it work for me:
diff --git a/uncompyle2/Scanner25.py b/uncompyle2/Scanner25.py
index 9fc3f4d..8c9d17d 100755
--- a/uncompyle2/Scanner25.py
+++ b/uncompyle2/Scanner25.py
@@ -91,11 +91,11 @@ class Scanner:
for x in toDel:
if self.code[x-delta] >= dis.HAVE_ARGUMENT:
self.code.pop(x-delta)
- self.restructCode(x-delta)
self.code.pop(x-delta)
- self.restructCode(x-delta)
self.code.pop(x-delta)
self.restructCode(x-delta)
+ self.restructCode(x-delta)
+ self.restructCode(x-delta)
delta += 3
else:
self.code.pop(x-delta)
In fact this code isn't wrong (but very crappy), I have to change the function name.
The role of the restructCode() function is to patch all jump opcode argument after I remove one byte from the bytecode. If I don't do that some jump goes to the wrong target.
"The call to restructCode() should be done only once,". You are right, and I've change that. Now function patch all jump target in one time. And then I delete the useless opcode.
I understand the role of this code. But I'm not really convinced it is correct - if it deletes 3-byte instruction byte by byte, and calls restructCode after each byte, how is it protected from treating 2nd and 3rd instruction byte as opcode?
I've seen it actually corrupting the code - restructCode
subtracted 1 from opcode itself when rewriting some "with" statement:
ParserError: --- This code section failed: ---
0 LOAD_FAST 'self'
3 LOAD_ATTR 'lock'
6 SETUP_WITH '120'
9 POP_TOP None
10 <123> None <-- input file is valid here!
13 LOAD_ATTR 'lower'
16 CALL_FUNCTION_0 None
19 LOAD_GLOBAL 'os'
22 LOAD_ATTR 'path'
25 LOAD_ATTR 'sep'
28 BINARY_ADD None
29 STORE_FAST 'local_path'
Said that, I'm not really sure if my patch is safe and correct (this code is fairly complex) - yet it solved my problem at hand ;)
Anyway, I'll try to prepare a simple isolated test.
OK, here's the code that failed decompiling:
from __future__ import with_statement
def clear_with_prefix(self, local_path):
with self.lock:
local_path = local_path.lower() + os.path.sep
to_del = []
for a in self.d:
if a.startswith(local_path):
to_del.append(a)
for elt in to_del:
del self.d[elt]
It worked OK after my dirty hack (I just changed the order, as original restructCode can deal with only 1 byte shift). I'll try your fix later, hope it will solve other failures I observe :)
FYI, your latest code (c3a874220fcebf7241d40d234d325ac3273ff3ae) now crashes with the sample above:
# 2012.09.20 20:53:40 CEST
#Embedded file name: o.py
# Can't uncompyle o.pyc
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/__init__.py", line 200, in main
uncompyle_file(infile, outstream, showasm, showast)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/__init__.py", line 137, in uncompyle_file
uncompyle(version, co, outstream, showasm, showast)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/__init__.py", line 126, in uncompyle
walker.gen_source(ast, customize)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/Walker.py", line 1384, in gen_source
self.print_(self.traverse(ast, isLambda=isLambda))
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/Walker.py", line 484, in traverse
self.preorder(node)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/spark.py", line 692, in preorder
self.preorder(kid)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/spark.py", line 692, in preorder
self.preorder(kid)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/spark.py", line 692, in preorder
self.preorder(kid)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/spark.py", line 687, in preorder
self.default(node)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/Walker.py", line 1158, in default
self.engine(table[key], node)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/Walker.py", line 1108, in engine
self.preorder(node[entry[arg]])
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/spark.py", line 685, in preorder
func(node)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/Walker.py", line 862, in n_mkfunc
self.make_function(node, isLambda=0)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/Walker.py", line 1259, in make_function
code = Code(code, self.scanner, self.currentclass)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/Scanner.py", line 60, in __init__
self._tokens, self._customize = scanner.disassemble(co, classname)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/Scanner25.py", line 91, in disassemble
self.restructCode(toDel)
File "/usr/local/lib/python2.7/dist-packages/uncompyle2/Scanner25.py", line 408, in restructCode
self.toChange[self.toChange.index(change)] -= self.op_size(self.code[toDel])
ValueError: 10 is not in list
# decompiled 0 files: 0 okay, 1 failed, 0 verify failed
# 2012.09.20 20:53:40 CEST
"if it deletes 3-byte instruction byte by byte, and calls restructCode after each byte, how is it protected from treating 2nd and 3rd instruction byte as opcode?"
Thanks to the first line of this code (from Scanner25) only jump opcode argument are affected :
for x in self.op_range(0, len(self.code)):
op = self.code[x]
if op >= HAVE_ARGUMENT:
if op in dis.hasjrel:
if x < i and self.get_target(x) > i:
...
def op_range(self, start, end):
while start < end:
yield start
start += self.op_size(self.code[start])
I change restructCode() for performance improvement and having clearest code.
Your behavior is due to the WITH_SETUP opcode. Because this opcode doesn't exist in the 2.5 bytecode table but is required for 2.7 bytecode.
So I have to create WITH_SETUP with other conditional jump (and it can be a wrong methode) :
if offset in self.toChange: # opcode to change to WITH_SETUP
if self.code[offset] == JA and self.code[oparg] == WITH_CLEANUP:
opname = 'SETUP_WITH'
cf[oparg] = cf.get(oparg, []) + [offset]
I'll try to improve handling of the with statement :) Thank you for reporting bug and debug information!
OK, I assumed incorrectly that restructCode() scans starting at i - while actually it always starts from index 0. Still, somehow restructCode() was corrupting the bytecode. I've spent a lot of time today dumping self.code at various places - after the loop with restructCode() it already was corrupted (as above - invalid opcode). Anyway, I can see you reworked this part significantly, so it's no longer relevant.
And as soon as I applied my workaround, "with" statement decompiled correctly, so I guess SETUP_WITH handling is itself correct (yet hackish) ! :)
Thanks for your support (pitty I have no time to dig it deeper, it's pretty interesting stuff).
I fiddled with the code myself. I think I stumpled upon the same problem when I disassembled a python 2.6 pyc file. I think it fixes this issue here because compiling https://github.com/Mysterie/uncompyle2/issues/2#issuecomment-8741094 using python 2.6 and then using uncompyle2 on it causes a similar error as https://github.com/Mysterie/uncompyle2/issues/2#issuecomment-8741575
The change for Scanner25.py should be exactly the same but I have not tried working with 2.5 pyc files.
Scanner26.py In the function restructCode the rearrange of toChange locations (so far used to mark with statements for python 27 ) fails because if toDel contains more than one valid entry then self.toChange.index(change) cannot find the same index again because the value it was looking for was changed in the previous loop.
My fix for uncompyle2/Scanner26.py
@@ -409,12 +409,15 @@ class Scanner:
result.append((block[0]+startBlock, block[1]))
self.linestarts = result
- for change in self.toChange:
+ for index in xrange(len(self.toChange)) :
+ change = self.toChange[index]
+ delta = 0
for toDel in listDel:
if change > toDel:
- self.toChange[self.toChange.index(change)] -= self.op_size(self.code[toDel])
+ delta += self.op_size(self.code[toDel])
else:
break
+ self.toChange[index] -= delta
for jmp in self.op_range(0, len(self.code)):
op = self.code[jmp]
Using xrange to iterate over self.toChange is not necessary but I wanted to get rid of the index method.
I've checked restructCode() function and you're right, it fail only with WITH_SETUP statement on the condition you explain. I've patched the code, thanks to your report and fix :)
I believe the following fragment of Scanner25.py is wrong:
For 3-byte instructions it tries to restruct the code after each byte removed - thus, if e.g. restructCode is called after the first byte is removed, it will treat second byte of the instruction as opcode.
The call to restructCode() should be done only once, after removing all 3 bytes. However, this means that restructCode() need to know if 1 or 3 bytes were removed.