zrax / pycdc

C++ python bytecode disassembler and decompiler
GNU General Public License v3.0

Support Python 3.9 decompilation #450

Open zrax opened 9 months ago

zrax commented 9 months ago

Tasks

AndrewLiuZhangZong commented 8 months ago

When to support it

Bang1338 commented 8 months ago

When to support it

It will be supported if you stop asking.

AngeloD2022 commented 7 months ago

Thank you for maintaining this awesome project. ❤️

johnniesong commented 7 months ago

Thank you for maintaining this awesome project. ❤️

leqnam commented 7 months ago

still waiting for JUMP_IF_NOT_EXC_MATCH

Bang1338 commented 7 months ago

still waiting for JUMP_IF_NOT_EXC_MATCH

Give them some time.

Ma5onic commented 6 months ago

@zrax, I think I found some hints on how to handle some of the missing opcodes for Python 3.9: https://github.com/greyblue9/unpyc37-3.10/blob/master/src/unpyc/unpyc3.py https://github.com/greyblue9/unpyc37-3.10/blob/master/src/unpyc/opcodes.py

I was getting JUMP_IF_NOT_EXC_MATCH, WITH_EXCEPT_START & RERAISE errors on 3.10 with pycdc, but the script I linked supports those opcodes. It's written in Python, but it might help figure out how to implement them in C++.

Levak commented 5 months ago

Hi,

I do not bring good news: I have analyzed the bytecode of Python 3.8, 3.9, 3.10 and 3.11 with regard to try-except-finally blocks, which are super common in Python scripts, and the logic changes from one version to the next. I am currently focusing on 3.9 in my spare time and, needless to say, it's already a headache. I do not fully grasp pycdc's code base, but I did try other alternatives, such as the one mentioned by @Ma5onic, to no avail.

My first observation, for Python 3.9: JUMP_IF_NOT_EXC_MATCH rel_delta is equivalent to COMPARE_OP (exception match) followed by POP_JUMP_IF_FALSE abs_delta.
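The equivalence above can be sketched as a small rewriting pass. This is a toy model and not pycdc's code: `Instr`, `lower_exc_match` and the label-style jump target `"next_handler"` are hypothetical names, and targets are kept as opaque labels so the relative-versus-absolute encoding question is left aside. The one fact borrowed from CPython is that, up to 3.8, `COMPARE_OP` with operand 10 means "exception match".

```python
from typing import List, NamedTuple, Union

# Hypothetical toy instruction: an opcode name plus one argument.
class Instr(NamedTuple):
    opname: str
    arg: Union[int, str, None] = None

# In CPython <= 3.8, dis.cmp_op[10] is 'exception match'.
EXC_MATCH = 10

def lower_exc_match(code: List[Instr]) -> List[Instr]:
    """Rewrite each JUMP_IF_NOT_EXC_MATCH into the older two-opcode form."""
    out: List[Instr] = []
    for ins in code:
        if ins.opname == "JUMP_IF_NOT_EXC_MATCH":
            # Same test, same jump target, expressed with pre-3.9 opcodes.
            out.append(Instr("COMPARE_OP", EXC_MATCH))
            out.append(Instr("POP_JUMP_IF_FALSE", ins.arg))
        else:
            out.append(ins)
    return out

# Roughly the shape of the start of an `except ValueError:` handler in 3.9.
handler = [
    Instr("DUP_TOP"),
    Instr("LOAD_NAME", "ValueError"),
    Instr("JUMP_IF_NOT_EXC_MATCH", "next_handler"),
    Instr("POP_TOP"),
]
lowered = lower_exc_match(handler)
for ins in lowered:
    print(ins.opname, ins.arg)
```

On real bytecode the rewrite inserts an extra instruction, so every numeric jump offset after it would have to be recomputed; using labels here sidesteps that step.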

My second observation, for Python 3.8/3.9: SETUP_FINALLY replaces both SETUP_FINALLY and SETUP_EXCEPT. An important detail is that in Python 3.9 the bytecode for the finally block is duplicated, meaning there will be junk to ignore at the end of the file when reconstructing code. In Python 3.10, the duplication of the finally block is even worse. In Python 3.11, they started using exception tables, which radically changes the logic.
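One way to check these differences is to run the same small script under each interpreter and compare the results. The sketch below uses only the standard `dis` module; `sample` and `summarize` are hypothetical helper names, and no expected output is shown because it depends on the Python version running it. It counts how many times the `"cleanup"` constant is loaded as a rough proxy for how many copies of the finally body the compiler emitted.

```python
import dis
import sys

def sample():
    # A try/finally whose finally body is easy to spot in the bytecode.
    try:
        return 1
    finally:
        print("cleanup")

def summarize(func):
    """Collect version-dependent facts about how try/finally was compiled."""
    code = func.__code__
    instructions = list(dis.get_instructions(code))
    names = [ins.opname for ins in instructions]
    return {
        "python": "%d.%d" % sys.version_info[:2],
        # SETUP_FINALLY / SETUP_EXCEPT style block opcodes, if any exist.
        "setup_opcodes": sorted({n for n in names if n.startswith("SETUP_")}),
        # Each copy of the finally body loads the "cleanup" constant once.
        "cleanup_loads": sum(1 for ins in instructions if ins.argval == "cleanup"),
        # Exception tables appear as a code-object attribute from 3.11 on.
        "has_exception_table": hasattr(code, "co_exceptiontable"),
    }

info = summarize(sample)
print(info)
```

Running it under 3.8, 3.9, 3.10 and 3.11 side by side gives a quick view of which block opcodes and how many finally copies a decompiler has to cope with in each version.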

This is probably not going to come as a surprise to anybody, but adding support for any of these versions in pycdc is IMHO "dead weight". It is already too late. The bytecode has evolved faster than decompilers could (not blaming their maintainers, this is just an observation), and if someone suddenly wants to write or update one, they aren't going to try to maintain all the skipped versions. They're going to go straight for 3.12 or higher and bump the decompiler's major version.

As an example, I am trying to decompile a .pyc made with Python 3.9, and I realized it was faster to decompile it by hand than to adapt any decompiler.

Nonetheless, I will continue to tinker with pycdc on my side to try to make Python 3.9's try-except-finally block somehow work. I'm sure @zrax will understand the chest pain I'm having rn 😄

Levak commented 5 months ago

Good news, I've got a working try-except-finally candidate for Python 3.8 AND 3.9. It's probably buggy in some cases I did not find yet, but for simple scripts, it works. What I did was to emulate new and/or removed opcodes with already implemented opcodes so that I did not have to re-invent the wheel.

@zrax I've noticed the TODO list is missing a recent PR you merged. I've also added the PR I just opened as a reference.

#488: WITH_EXCEPT_START

#493: RERAISE && JUMP_IF_NOT_EXC_MATCH