zrax / pycdc

C++ python bytecode disassembler and decompiler
GNU General Public License v3.0

Support Python 3.11 decompilation #452

Open zrax opened 7 months ago

zrax commented 7 months ago

Tasks

TiZCrocodile commented 7 months ago

POP_JUMP_BACKWARD_IF_FALSE and POP_JUMP_BACKWARD_IF_TRUE: I don't see a case in ASTree.cpp for these opcodes, so I don't think they're supported yet.

zrax commented 7 months ago

You're right, not sure why I marked those... Fixed now, thanks

TiZCrocodile commented 7 months ago

The opcodes POP_JUMP_FORWARD_IF_FALSE and POP_JUMP_FORWARD_IF_TRUE are supported. EDIT: POP_JUMP_BACKWARD_IF_FALSE and POP_JUMP_BACKWARD_IF_TRUE are not supported, just the forward variants.

But anyway, I wanted to ask how to work on this project, because the .gitignore file doesn't ignore Visual Studio files, so I see a lot of changes in GitHub Desktop and it's confusing. Also, is there a way to communicate with you about things in this project? For example, if I want to change all the stack.top(); stack.pop(); lines to just call a function, are you going to approve this? Things like that, so I know what you want or not, because I love this project. Thank you very much :)

zrax commented 7 months ago

The opcodes POP_JUMP_FORWARD_IF_FALSE and POP_JUMP_FORWARD_IF_TRUE are supported. EDIT: POP_JUMP_BACKWARD_IF_FALSE and POP_JUMP_BACKWARD_IF_TRUE are not supported, just the forward variants.

Thanks, fixed (again)

how to work on this project, because the .gitignore file doesn't ignore Visual Studio files, so I see a lot of changes in GitHub Desktop and it's confusing.

Personally, I do most of my development on it (in the rare opportunities that I have time to do so) from Linux or MSYS2, since that's where the test suite stuff works. However, more generally, when I'm using MSVC with CMake projects, I'll confine the build to a single directory that I can ignore entirely with a line in my local .git/info/exclude. Since there are many IDEs that all have their own mess of files, I don't generally bother putting those in .gitignore. That's just my personal preference though, I'm not opposed to someone else adding MSVC files to .gitignore if they know what to add.

For example, if I want to change all the stack.top(); stack.pop(); lines to just call a function, are you going to approve this? Things like that, so I know what you want or not, because I love this project. Thank you very much :)

The short history there is just that the stack was originally a std::stack<...>, so I kept the API compatible with STL for simplicity. Changing it to use a more ergonomic API just hasn't been a priority so far.

TiZCrocodile commented 7 months ago

Ohh, the .git/info/exclude and confining the build to one directory was the trick, I guess. Thank you very much. I meant changing the stack.top / pop calls because there is a function named StackPopTop at the start of ASTree.cpp that just pops and returns, but it is never used. Anyway, how can I communicate with you on GitHub? Is there a chat or something, instead of chatting in issues?

sffool commented 6 months ago

Please support: JUMP_BACKWARD

greenozon commented 6 months ago

@sffool could you attach a sample .pyc with that opcode?

sffool commented 6 months ago

Tomorrow I will send a .pyc file that I want to decompile in an email.

Thank you! My hero.

zrax commented 6 months ago

Please stop recommending people attach .pyc files... They are not useful. A much more useful recommendation is to provide a (small) bit of python source that compiles to the opcodes in question, so it can be used as a test case.
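As an illustration of that kind of check (a minimal sketch, not part of pycdc; `opcodes_in` and `SNIPPET` are hypothetical names introduced here), a short script using the standard `dis` module can list which opcodes the running interpreter emits for a small piece of source, including nested functions. The result depends on the Python version used to run it, so it should be run with the interpreter version being targeted (3.11 in this issue).

```python
import dis
import types


def opcodes_in(source: str) -> set[str]:
    """Return the names of all opcodes the running interpreter emits for `source`."""
    names: set[str] = set()
    pending = [compile(source, "<test-case>", "exec")]
    while pending:
        code = pending.pop()
        for ins in dis.get_instructions(code):
            names.add(ins.opname)
        # Nested functions, classes and comprehensions are stored in co_consts.
        pending.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return names


# Hypothetical candidate test case: a small loop.
SNIPPET = """
def loop(items):
    for item in items:
        pass
"""

if __name__ == "__main__":
    # Prints the sorted opcode names; the exact list varies by Python version.
    print(sorted(opcodes_in(SNIPPET)))
```

If the opcode in question shows up in the printed list, the snippet is a reasonable candidate for a test case.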

sffool commented 6 months ago

I'm sorry, I don't know how to write Python.

greenozon commented 6 months ago

@zrax OK, but usually with a .pyc you could a) have a sample to play with, and b) create a test case in Python (after reading and understanding the provided .pyc), and thus close the gap.

kibernautas commented 6 months ago

I have been exploring the opcodes and what generates them. I know it ain't much, but for me it was a nice way to have a list of the opcodes and their main functionality at the time they were introduced. Hope it can help.

import dis
import asyncio

class ContextManager:
    def __enter__(self):
        print("Entering context manager")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        print("Exiting context manager")

def BEFORE_WITH():
    with ContextManager():
        print("Inside context manager")

def PUSH_EXC_INFO():
    try:
        raise Exception("This is an exception")
    except Exception as e:
        raise e

def CHECK_EXC_MATCH():
    try:
        raise ValueError("This is a ValueError")
    except (ValueError, TypeError) as e:
        print(f"Caught exception: {e}")

async def async_generator():
    await asyncio.sleep(0)
    yield 1
    await asyncio.sleep(0)
    yield 2

async def use_async_generator():
    async_gen = async_generator()
    try:
        value = await async_gen.asend(None)
        print(value)  
        value = await async_gen.asend(None)
        print(value)  
    finally:
        await async_gen.aclose()

def PREP_RERAISE_STAR():
    # Assumption: PREP_RERAISE_STAR is only emitted for try/except* (3.11+),
    # so this uses an exception group instead of a plain try/except.
    try:
        raise ExceptionGroup("group", [ValueError("This is a ValueError")])
    except* ValueError as eg:
        print(f"Caught exception group: {eg!r}")

def SWAP():
    my_array = [1, 2, 3, 4, 5]
    i = 1
    j = 3
    my_array[i], my_array[j] = my_array[j], my_array[i]

def COPY():
    a = 10
    b = 10
    c = 10
    return a == b == c

def subgenerator():
    yield 1
    yield 2
    yield 3

def SEND():
    yield from subgenerator()

def POP_JUMP_FORWARD_IF_NOT_NONE(item): # AND POP_JUMP_FORWARD_IF_NONE
    if item is None:  
        print('case 1')
    else:
        print('case 2')

    while item is None:
        pass

async def async_function():
    await asyncio.sleep(1)
    return 'Hello, World!'

async def GET_AWAITABLE():
    result = await async_function()
    print(result)

asyncio.run(GET_AWAITABLE())

def generator():
    yield 1
    yield 2
    yield 3

def JUMP_BACKWARD_NO_INTERRUPT():
    yield from generator()

gen = JUMP_BACKWARD_NO_INTERRUPT()
print(next(gen))
print(next(gen))
print(next(gen))

def JUMP_BACKWARD():
    iterable = [1, 2, 3]
    for item in iterable:
        pass

    while True:
        break

def COPY_FREE_VARS():
    a = 10
    def inner_function():
        nonlocal a
        print(a)
    inner_function()

def POP_JUMP_BACKWARD_IF_NOT_NONE():
    item = None
    while item is not None:
        pass

def POP_JUMP_BACKWARD_IF_NONE():
    item = None
    while item is None:
        pass

def POP_JUMP_BACKWARD_IF_FALSE():
    cond = False
    while not cond:
        pass

def POP_JUMP_BACKWARD_IF_TRUE():
    cond = True
    while cond:
        cond

if __name__ == "__main__":
    # print("BEFORE_WITH")
    # dis.dis(BEFORE_WITH) # BEFORE_WITH
    # print("PUSH_EXC_INFO")
    # dis.dis(PUSH_EXC_INFO) # PUSH_EXC_INFO
    # print("CHECK_EXC_MATCH")
    # dis.dis(CHECK_EXC_MATCH) # CHECK_EXC_MATCH
    # print("RETURN_GENERATOR | And ASYNC_GEN_WRAP when casted with asyncio.run")
    # dis.dis(use_async_generator) # RETURN_GENERATOR | And ASYNC_GEN_WRAP when casted with asyncio.run
    # print("PREP_RERAISE_STAR")
    # dis.dis(PREP_RERAISE_STAR)
    # print("SWAP")
    # dis.dis(SWAP)
    #print("COPY")
    #dis.dis(COPY)
    #print("SEND")
    #dis.dis(SEND)
    #print("POP_JUMP_FORWARD_IF_NOT_NONE AND POP_JUMP_FORWARD_IF_NONE")
    #dis.dis(POP_JUMP_FORWARD_IF_NOT_NONE) # AND POP_JUMP_FORWARD_IF_NONE
    #print("GET_AWAITABLE")
    #dis.dis(GET_AWAITABLE) # GET_AWAITABLE
    #print("JUMP_BACKWARD_NO_INTERRUPT")
    #dis.dis(JUMP_BACKWARD_NO_INTERRUPT) # JUMP_BACKWARD_NO_INTERRUPT
    #print("JUMP_BACKWARD")
    #dis.dis(JUMP_BACKWARD) # JUMP_BACKWARD
    #print("COPY_FREE_VARS")
    #dis.dis(COPY_FREE_VARS) # COPY_FREE_VARS
    #print("POP_JUMP_BACKWARD_IF_NOT_NONE")
    #dis.dis(POP_JUMP_BACKWARD_IF_NOT_NONE) # POP_JUMP_BACKWARD_IF_NOT_NONE
    #print("POP_JUMP_BACKWARD_IF_NONE")
    #dis.dis(POP_JUMP_BACKWARD_IF_NONE) # POP_JUMP_BACKWARD_IF_NONE
    #print("POP_JUMP_BACKWARD_IF_FALSE")
    #dis.dis(POP_JUMP_BACKWARD_IF_FALSE) # POP_JUMP_BACKWARD_IF_FALSE
    #print("POP_JUMP_BACKWARD_IF_TRUE")
    #dis.dis(POP_JUMP_BACKWARD_IF_TRUE) # POP_JUMP_BACKWARD_IF_TRUE

greenozon commented 6 months ago

Please support: JUMP_BACKWARD

possible solution https://github.com/zrax/pycdc/pull/472

marfixdev commented 6 months ago

Unsupported opcode: COPY

Please add support for it. I can provide a .pyc file if you want.

scylamb commented 4 months ago

Unsupported opcode: CALL_FUNCTION_EX https://docs.python.org/3/library/dis.html#opcode-CALL_FUNCTION_EX Added in version 3.11. Please support: CALL_FUNCTION_EX

uniplate commented 4 months ago

Unsupported opcode: CALL_FUNCTION_EX https://docs.python.org/3/library/dis.html#opcode-CALL_FUNCTION_EX Added in version 3.11. Please support: CALL_FUNCTION_EX

Also facing this error! Please support: CALL_FUNCTION_EX

hanfangyuan4396 commented 4 months ago

I found an amazing method to deal with unsupported opcodes: I get the disassembly with pycdas, then give it to gpt-4o and let it reverse it into Python code. It did it!

stdedos commented 3 months ago

@zrax Do you accept maybe-source and bytecode for missing features?

Maybe in private?

Sun92Go commented 2 months ago

could you help support MAKE_CELL

NyanAlex commented 1 month ago

please add support for MAKE_CELL

greenozon commented 1 month ago

do you have some basic .py and .pyc that uses the MAKE_CELL opcode?

stdedos commented 1 month ago

do you have some basic .py and .pyc that uses the MAKE_CELL opcode?

Idk what you define as "basic". Is a script total of 105 lines (one non-stdlib dependency) "basic"?

greenozon commented 1 month ago

yeah, something like that (my samples, pyc-s, are producing almost 1M pycdas output in size...)

NyanAlex commented 1 month ago

do you have some basic .py and .pyc that uses the MAKE_CELL opcode?

Sorry for not answering for a long time!

import dis

def make_closure():
    x = 10
    def inner():
        return x
    return inner

def main():
    closure = make_closure()

    print("Result from closure:", closure())

    print("\nDisassembly of the closure function:")
    dis.dis(closure)

if __name__ == "__main__":
    main()

greenozon commented 1 month ago

I see no CELL opcodes:

python cell.py
Result from closure: 10

Disassembly of the closure function:
              0 COPY_FREE_VARS           1

  5           2 RESUME                   0

  6           4 LOAD_DEREF               0 (x)
              6 RETURN_VALUE
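
A possible explanation, stated as an assumption about CPython 3.11 and later: MAKE_CELL is emitted in the function that owns the captured variable (make_closure here), whereas the script above only disassembles the returned inner function. A minimal sketch that checks both (the `opnames` helper is a hypothetical name introduced for illustration):

```python
import dis
import sys


def make_closure():
    x = 10

    def inner():
        return x

    return inner


def opnames(func) -> set[str]:
    # dis.get_instructions does not recurse into nested code objects,
    # so each function is inspected on its own.
    return {ins.opname for ins in dis.get_instructions(func)}


outer_ops = opnames(make_closure)      # the function that creates the cell
inner_ops = opnames(make_closure())    # the closure that reads from it

if __name__ == "__main__":
    print("Python", sys.version_info[:2])
    print("MAKE_CELL in make_closure:", "MAKE_CELL" in outer_ops)
    print("MAKE_CELL in inner:", "MAKE_CELL" in inner_ops)
```

Under that assumption, calling dis.dis(make_closure) rather than dis.dis(closure) in the sample script should show the opcode when run on Python 3.11 or later.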

Layzzz66 commented 1 month ago

Sir @zrax, when I try to decompile a .pyc file, I get an error like std::bad_cast or bad magic. How do I solve it? I'm new to this and don't know anything about it.

greenozon commented 1 month ago

@Layzzz66 please open a new issue and attach your .pyc; this is the common way...

NyanAlex commented 1 month ago

I see no CELL opcodes:

python cell.py
Result from closure: 10

Disassembly of the closure function:
              0 COPY_FREE_VARS           1

  5           2 RESUME                   0

  6           4 LOAD_DEREF               0 (x)
              6 RETURN_VALUE

app.pyc.zip