zrax / pycdc

C++ python bytecode disassembler and decompiler
GNU General Public License v3.0
Unsupported node type 48 in NODE_JOINEDSTR #438

Closed Puyodead1 closed 3 months ago

Puyodead1 commented 3 months ago

Getting this with Python 3.9. a.zip (GitHub doesn't allow uploading the regular .pyc)

greenozon commented 3 months ago

Try this result; not ideal, but still... a-issue438.cdc2.txt

ddouworld commented 3 months ago

I think the two opcodes JUMP_IF_NOT_EXC_MATCH_A and RERAISE are not implemented. If you add them manually in ASTree.cpp with empty handlers, the output looks a little more complete, right? 19a9e014268405a9d3d81254dac58d0d

greenozon commented 3 months ago

Hmm, good start! But why case 99? Each and every opcode should be referred to by its literal name, shouldn't it?

ddouworld commented 3 months ago

Yes, you're right, but I was just doing a test, so it's not that rigorous. 99 can be replaced with Pyc::RERAISE.

greenozon commented 3 months ago

Yeah, but you might hit a tricky case where the same opcode value means different things in different Python versions.
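
To illustrate the point, here is a minimal Python sketch (not pycdc code) of a lookup keyed by version. The table holds only a handful of raw opcode numbers from CPython 3.8 and 3.9, and the helper name opcode_name and the table layout are hypothetical, chosen just for this example.

```python
# Hypothetical illustration, not part of pycdc: a raw opcode number is only
# meaningful together with the Python version that produced the bytecode.
RAW_OPCODES = {
    (3, 8): {88: "END_FINALLY", 162: "CALL_FINALLY", 163: "POP_FINALLY"},
    (3, 9): {48: "RERAISE", 162: "LIST_EXTEND", 163: "SET_UPDATE"},
}


def opcode_name(version, raw):
    """Return the opcode name for a raw value, or None if it is not in the sample table."""
    return RAW_OPCODES.get(version, {}).get(raw)


if __name__ == "__main__":
    # The same raw value 162 resolves to different opcodes per version.
    for version in sorted(RAW_OPCODES):
        print(version, 162, opcode_name(version, 162))
```

Matching on a symbolic name instead of a bare number keeps a handler from silently applying to an unrelated opcode when the numbering shifts between releases.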

ddouworld commented 3 months ago

Yes, this is a very tricky problem. I tried to fix it yesterday for Python 3.9. The test case I used looked like this:

try:
    print("test")
except:
    pass

It works fine on Python 3.7, where the bytecode looks like this:

        0       SETUP_EXCEPT                    12 (to 14)
        2       LOAD_NAME                       0: print
        4       LOAD_CONST                      0: 'test'
        6       CALL_FUNCTION                   1
        8       POP_TOP
        10      POP_BLOCK
        12      JUMP_FORWARD                    12 (to 26)
        14      POP_TOP
        16      POP_TOP
        18      POP_TOP
        20      POP_EXCEPT
        22      JUMP_FORWARD                    2 (to 26)
        24      END_FINALLY
        26      LOAD_CONST                      1: None
        28      RETURN_VALUE

However, there are problems with Python 3.9, whose bytecode looks like this:

        0       SETUP_FINALLY                   12 (to 14)
        2       LOAD_NAME                       0: print
        4       LOAD_CONST                      0: 'test'
        6       CALL_FUNCTION                   1
        8       POP_TOP
        10      POP_BLOCK
        12      JUMP_FORWARD                    12 (to 26)
        14      POP_TOP
        16      POP_TOP
        18      POP_TOP
        20      POP_EXCEPT
        22      JUMP_FORWARD                    2 (to 26)
        24      RERAISE
        26      LOAD_CONST                      1: None
        28      RETURN_VALUE
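
For anyone who wants to reproduce the comparison, a minimal Python sketch along these lines lists the opcode names that the running interpreter emits for the same test case. It uses only the standard dis module; the helper name opcode_names is made up for this example, and the output depends on which Python version runs it.

```python
import dis
import sys

# The test case from this comment, as a source string.
SOURCE = '''
try:
    print("test")
except:
    pass
'''


def opcode_names(source):
    """Compile source and return the opcode names of the module-level code object."""
    code = compile(source, "<test>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


if __name__ == "__main__":
    print("Python %d.%d" % sys.version_info[:2])
    for name in opcode_names(SOURCE):
        print(name)
```

Running it under 3.7 and 3.9 side by side should show the SETUP_EXCEPT / SETUP_FINALLY and END_FINALLY / RERAISE differences seen in the two listings.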

I modified ASTree.cpp. Not much code changed; it looks like this:

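    case Pyc::SETUP_FINALLY_A:
    {
        /* Before Python 3.9: open a container block and expect a try */
        if (mod->verCompare(3, 9) < 0) {
            PycRef<ASTBlock> next = new ASTContainerBlock(pos + operand);
            blocks.push(next.cast<ASTBlock>());
            curblock = blocks.top();

            need_try = true;
            break;
        }
        /* Python 3.9+: fall through and handle it like SETUP_EXCEPT */
    }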
    case Pyc::SETUP_EXCEPT_A:
    {
        if (curblock->blktype() == ASTBlock::BLK_CONTAINER) {
            curblock.cast<ASTContainerBlock>()->setExcept(pos + operand);
        }
        else {
            PycRef<ASTBlock> next = new ASTContainerBlock(0, pos + operand);
            blocks.push(next.cast<ASTBlock>());
        }

        /* Store the current stack for the except/finally statement(s) */
        stack_hist.push(stack);
        PycRef<ASTBlock> tryblock = new ASTBlock(ASTBlock::BLK_TRY, pos + operand, true);
        blocks.push(tryblock.cast<ASTBlock>());
        curblock = blocks.top();

        need_try = false;
    }
    break;

I put RERAISE and END_FINALLY in the same case. On Python 3.9, the decompilation succeeded, but there are still a lot of problems. For example:

try:
    print("test")
except Exception as e:
    pass

Decompiling this code fails. I committed the code to my repository; you can check it out.

greenozon commented 3 months ago

Good stuff! Funny, though: in Python 3.12 it's even more different:

    [Disassembly]
        0       RESUME                        0
        2       NOP
        4       PUSH_NULL
        6       LOAD_NAME                     0: print
        8       LOAD_CONST                    0: 'test'
        10      CALL                          1
        18      POP_TOP
        20      RETURN_CONST                  1: None
        22      PUSH_EXC_INFO
        24      POP_TOP
        26      POP_EXCEPT
        28      RETURN_CONST                  1: None
        30      COPY                          3
        32      POP_EXCEPT
        34      RERAISE                       1

And this is the one for the "except Exception as e" version:

    [Disassembly]
        0       RESUME                        0
        2       NOP
        4       PUSH_NULL
        6       LOAD_NAME                     0: print
        8       LOAD_CONST                    0: 'test'
        10      CALL                          1
        18      POP_TOP
        20      RETURN_CONST                  1: None
        22      PUSH_EXC_INFO
        24      LOAD_NAME                     1: Exception
        26      CHECK_EXC_MATCH
        28      POP_JUMP_IF_FALSE             10 (to 50)
        30      STORE_NAME                    2: e
        32      POP_EXCEPT
        34      LOAD_CONST                    1: None
        36      STORE_NAME                    2: e
        38      DELETE_NAME                   2: e
        40      RETURN_CONST                  1: None
        42      LOAD_CONST                    1: None
        44      STORE_NAME                    2: e
        46      DELETE_NAME                   2: e
        48      RERAISE                       1
        50      RERAISE                       0
        52      COPY                          3
        54      POP_EXCEPT
        56      RERAISE                       1

ddouworld commented 3 months ago

Yes, the bytecode varies a lot between versions. It has become a hassle to fix.
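
Part of the difficulty, as far as I understand it, is that CPython 3.11 and later no longer emit a SETUP_FINALLY instruction for try blocks. The handler ranges live in a separate exception table on the code object (the co_exceptiontable attribute), which is why the 3.12 listings above go straight from RESUME / NOP into the body. A minimal Python sketch to check this on whatever interpreter is at hand (the function name has_exception_table is made up for this example):

```python
import sys

# Same try/except test case as earlier in the thread.
SOURCE = 'try:\n    print("test")\nexcept:\n    pass\n'


def has_exception_table(source):
    """Return True if the compiled code carries a non-empty co_exceptiontable."""
    code = compile(source, "<test>", "exec")
    # The attribute only exists on Python 3.11+; older versions fall back to b"".
    return bool(getattr(code, "co_exceptiontable", b""))


if __name__ == "__main__":
    print(sys.version_info[:2], has_exception_table(SOURCE))
```

So for 3.11+ a decompiler would need to read that table to find the try/except ranges, instead of relying on a setup opcode in the instruction stream.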

zrax commented 3 months ago

Duplicate #450