python / cpython

The Python programming language
https://www.python.org
Other
63.13k stars 30.22k forks source link

compiler emits EXTENDED_ARG + NOP sequence in 3.10 #89918

Closed 1ca87b87-603e-4ad4-aa1d-118638373bd3 closed 1 month ago

1ca87b87-603e-4ad4-aa1d-118638373bd3 commented 2 years ago
BPO 45757
Nosy @markshannon, @serhiy-storchaka, @miss-islington, @brandtbucher, @iritkatriel, @rokm
PRs
  • python/cpython#29480
  • python/cpython#29502
  • python/cpython#29506
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['type-bug', 'library', '3.10', '3.11'] title = 'compiler emits EXTENDED_ARG + NOP sequence in 3.10' updated_at = user = 'https://github.com/rokm' ``` bugs.python.org fields: ```python activity = actor = 'brandtbucher' assignee = 'none' closed = False closed_date = None closer = None components = ['Library (Lib)'] creation = creator = 'rok.mandeljc' dependencies = [] files = [] hgrepos = [] issue_num = 45757 keywords = ['patch'] message_count = 14.0 messages = ['405985', '405986', '405988', '405989', '405991', '405999', '406031', '406047', '406048', '406056', '406057', '406058', '406060', '406130'] nosy_count = 6.0 nosy_names = ['Mark.Shannon', 'serhiy.storchaka', 'miss-islington', 'brandtbucher', 'iritkatriel', 'rok.mandeljc'] pr_nums = ['29480', '29502', '29506'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue45757' versions = ['Python 3.10', 'Python 3.11'] ```

    1ca87b87-603e-4ad4-aa1d-118638373bd3 commented 2 years ago

    dis module incorrectly handles the instruction sequence where EXTENDED_ARG is followed by NOP, e.g.,:

    EXTENDED_ARG 0x01 NOP EXTENDED_ARG 0x01 LOAD_CONST 0x29

    The above sequence loads the constant from index 0x0129, whereas dis attempts to load it from index 0x010129, resulting in "IndexError: tuple index out of range error" when attempting to iterate over instructions returned by dis.get_instructions().

    Complete example:

    # *** example.py begin ***
    import types
    import dis
    
    # Create a test code object...
    constants = [None] * (0x000129 + 1)
    constants[0x000129] = "Hello world!"
    
    code = types.CodeType(
        0,  # argcount
        0,  # posonlyargcount
        0,  # kwonlyargcount
        0,  # nlocals
        1,  # stacksize
        64,  # flags
        bytes([
            0x90, 0x01,  # EXTENDED_ARG 0x01
            0x09, 0xFF,  # NOP 0xFF
            0x90, 0x01,  # EXTENDED_ARG 0x01
            0x64, 0x29,  # LOAD_CONST 0x29
            0x53, 0x00,  # RETURN_VALUE 0x00
        ]),  # codestring=
        tuple(constants),  # constants
        (),  # names
        (),  # varnames
        '<no file>',  # filename
        'code',  # name
        1,  # firstlineno
        b''  # linetable
    )
    
    # ... and eval it to show that NOP resets EXTENDED_ARG
    print("Output:", eval(code))
    
    # Try to list all instructions via dis
    print(list(dis.get_instructions(code)))

    # ** example.py end **\

    Running the example gives us:

    Output: Hello world!
    Traceback (most recent call last):
      File "/home/rok/example.py", line 35, in <module>
        print(list(dis.get_instructions(code)))
      File "/usr/lib64/python3.10/dis.py", line 338, in _get_instructions_bytes
        argval, argrepr = _get_const_info(arg, constants)
      File "/usr/lib64/python3.10/dis.py", line 292, in _get_const_info
        argval = const_list[const_index]
    IndexError: tuple index out of range

    To fix the problem on dis side, the extended_arg in dis._unpack_opargs should be reset to 0 when NOP (or perhaps any opcode without argument) is encountered. I.e., by adding "extended_arg = 0" here: https://github.com/python/cpython/blob/91275207296c39e495fe118019a757c4ddefede8/Lib/dis.py#L525

    The problematic behavior was reported by users of PyInstaller under python 3.10; it seems that python 3.10 generates bytecode involving EXTENDED_ARG + NOP for telegram.message.py: https://raw.githubusercontent.com/python-telegram-bot/python-telegram-bot/v13.7/telegram/message.py

    The original PyInstaller issue with preliminary analysis is here: https://github.com/pyinstaller/pyinstaller/issues/6301

    iritkatriel commented 2 years ago

    Thanks for the report. This should fix it, I'll create a patch:

    @@ -428,6 +428,7 @@ def _unpack_opargs(code):
                 extended_arg = (arg << 8) if op == EXTENDED_ARG else 0
             else:
                 arg = None
    +            extended_arg = 0
             yield (i, op, arg)
    iritkatriel commented 2 years ago

    + Mark for FYI.

    The behaviour of dis is not new, I wonder if generating code with EXTENDED_ARG + NOP is new?

    1ca87b87-603e-4ad4-aa1d-118638373bd3 commented 2 years ago

    The EXTENDED_ARG + NOP sequence seems to be specific to python 3.10; we could not reproduce the original bug (dis failing on telegram.message) with earlier python versions. The combination makes little sense, so perhaps it is a bug on its own?

    brandtbucher commented 2 years ago

    I have a hunch that this is caused in optimize_basic_block. In general, we don't bother clearing the oparg when peepholing instructions into NOPs.

    We could either start becoming more principled about that and fix all of the existing NOPs, or just update instrsize and write_op_arg to ignore args for instructions that don't use them.

    The latter seems easier and less error-prone.

    iritkatriel commented 2 years ago

    We should also verify that the interpreter doesn’t have its own version of the dis bug.

    brandtbucher commented 2 years ago

    I don't think that it does, since oparg gets totally reassigned each time we load a new instruction. EXTENDED_ARG actually needs to hack around this by advancing the instruction itself, updating oparg, and jumping straight into the next opcode.

    iritkatriel commented 2 years ago

    New changeset cb414cf0e207668300c4fe3f310c0bd249153273 by Irit Katriel in branch 'main': bpo-45757: Fix bug where dis produced an incorrect oparg on EXTENDED_ARG before a no-arg opcode (GH-29480) https://github.com/python/cpython/commit/cb414cf0e207668300c4fe3f310c0bd249153273

    iritkatriel commented 2 years ago

    The 3.9 backport had a conflict, but I think it's not worth worrying about because this scenario only showed up in 3.10.

    iritkatriel commented 2 years ago

    New changeset c5bfb88eb6f82111bb1603ae9d78d0476b552d66 by Irit Katriel in branch '3.10': [3.10] bpo-45757: Fix bug where dis produced an incorrect oparg on EXTENDED_ARG before a no-arg opcode (GH-29480) (GH-29506) https://github.com/python/cpython/commit/c5bfb88eb6f82111bb1603ae9d78d0476b552d66

    iritkatriel commented 2 years ago

    I don't think that it does, since oparg gets totally reassigned each time we load a new instruction. EXTENDED_ARG actually needs to hack around this by advancing the instruction itself, updating oparg, and jumping straight into the next opcode.

    Right, it's like this:

            TARGET(EXTENDED_ARG) {
                int oldoparg = oparg;
                NEXTOPARG();
                oparg |= oldoparg << 8;
                PRE_DISPATCH_GOTO();
                DISPATCH_GOTO();
            }

    It's seems correct (the next arg will be NOP so nothing happens). But it's wasteful.

    brandtbucher commented 2 years ago

    Indeed. Do you plan on removing the extra EXTENDED_ARGs in instrsize and write_op_arg? I can take care of it if not.

    iritkatriel commented 2 years ago

    That sounds like a good idea. Please go ahead.

    brandtbucher commented 2 years ago

    Here's a (more) minimal reproducer I've been able to create:

    # First, "use up" 256 unique constants:
    spam=0x00;spam=0x01;spam=0x02;spam=0x03;spam=0x04;spam=0x05;spam=0x06;spam=0x07;
    spam=0x08;spam=0x09;spam=0x0a;spam=0x0b;spam=0x0c;spam=0x0d;spam=0x0e;spam=0x0f;
    spam=0x10;spam=0x11;spam=0x12;spam=0x13;spam=0x14;spam=0x15;spam=0x16;spam=0x17;
    spam=0x18;spam=0x19;spam=0x1a;spam=0x1b;spam=0x1c;spam=0x1d;spam=0x1e;spam=0x1f;
    spam=0x20;spam=0x21;spam=0x22;spam=0x23;spam=0x24;spam=0x25;spam=0x26;spam=0x27;
    spam=0x28;spam=0x29;spam=0x2a;spam=0x2b;spam=0x2c;spam=0x2d;spam=0x2e;spam=0x2f;
    spam=0x30;spam=0x31;spam=0x32;spam=0x33;spam=0x34;spam=0x35;spam=0x36;spam=0x37;
    spam=0x38;spam=0x39;spam=0x3a;spam=0x3b;spam=0x3c;spam=0x3d;spam=0x3e;spam=0x3f;
    spam=0x40;spam=0x41;spam=0x42;spam=0x43;spam=0x44;spam=0x45;spam=0x46;spam=0x47;
    spam=0x48;spam=0x49;spam=0x4a;spam=0x4b;spam=0x4c;spam=0x4d;spam=0x4e;spam=0x4f;
    spam=0x50;spam=0x51;spam=0x52;spam=0x53;spam=0x54;spam=0x55;spam=0x56;spam=0x57;
    spam=0x58;spam=0x59;spam=0x5a;spam=0x5b;spam=0x5c;spam=0x5d;spam=0x5e;spam=0x5f;
    spam=0x60;spam=0x61;spam=0x62;spam=0x63;spam=0x64;spam=0x65;spam=0x66;spam=0x67;
    spam=0x68;spam=0x69;spam=0x6a;spam=0x6b;spam=0x6c;spam=0x6d;spam=0x6e;spam=0x6f;
    spam=0x70;spam=0x71;spam=0x72;spam=0x73;spam=0x74;spam=0x75;spam=0x76;spam=0x77;
    spam=0x78;spam=0x79;spam=0x7a;spam=0x7b;spam=0x7c;spam=0x7d;spam=0x7e;spam=0x7f;
    spam=0x80;spam=0x81;spam=0x82;spam=0x83;spam=0x84;spam=0x85;spam=0x86;spam=0x87;
    spam=0x88;spam=0x89;spam=0x8a;spam=0x8b;spam=0x8c;spam=0x8d;spam=0x8e;spam=0x8f;
    spam=0x90;spam=0x91;spam=0x92;spam=0x93;spam=0x94;spam=0x95;spam=0x96;spam=0x97;
    spam=0x98;spam=0x99;spam=0x9a;spam=0x9b;spam=0x9c;spam=0x9d;spam=0x9e;spam=0x9f;
    spam=0xa0;spam=0xa1;spam=0xa2;spam=0xa3;spam=0xa4;spam=0xa5;spam=0xa6;spam=0xa7;
    spam=0xa8;spam=0xa9;spam=0xaa;spam=0xab;spam=0xac;spam=0xad;spam=0xae;spam=0xaf;
    spam=0xb0;spam=0xb1;spam=0xb2;spam=0xb3;spam=0xb4;spam=0xb5;spam=0xb6;spam=0xb7;
    spam=0xb8;spam=0xb9;spam=0xba;spam=0xbb;spam=0xbc;spam=0xbd;spam=0xbe;spam=0xbf;
    spam=0xc0;spam=0xc1;spam=0xc2;spam=0xc3;spam=0xc4;spam=0xc5;spam=0xc6;spam=0xc7;
    spam=0xc8;spam=0xc9;spam=0xca;spam=0xcb;spam=0xcc;spam=0xcd;spam=0xce;spam=0xcf;
    spam=0xd0;spam=0xd1;spam=0xd2;spam=0xd3;spam=0xd4;spam=0xd5;spam=0xd6;spam=0xd7;
    spam=0xd8;spam=0xd9;spam=0xda;spam=0xdb;spam=0xdc;spam=0xdd;spam=0xde;spam=0xdf;
    spam=0xe0;spam=0xe1;spam=0xe2;spam=0xe3;spam=0xe4;spam=0xe5;spam=0xe6;spam=0xe7;
    spam=0xe8;spam=0xe9;spam=0xea;spam=0xeb;spam=0xec;spam=0xed;spam=0xee;spam=0xef;
    spam=0xf0;spam=0xf1;spam=0xf2;spam=0xf3;spam=0xf4;spam=0xf5;spam=0xf6;spam=0xf7;
    spam=0xf8;spam=0xf9;spam=0xfa;spam=0xfb;spam=0xfc;spam=0xfd;spam=0xfe;spam=0xff;
    # Then, define a function with at least two default positional arguments that:
    # - are all constants
    # - span multiple lines
    # - include a previously unused constant value
    def foo(
        bar=0x100,  # <-- This becomes EXTENDED_ARG(1) + NOP(0).
        baz=0x00,
    ): pass
    # The peephole pass will fold the constant default value tuple construction for
    # the function definition, but leave behind NOPs that cannot be removed (because
    # it would break tracing guarantees). One of these NOPs will still have a
    # "large" oparg.

    It's not really that hard to create unused arguments... it's just hard to make them large enough to need an EXTENDED_ARG! For example, a, b = b, a creates a ROT_TWO with an unused argument.

    Given the pretty tight coupling (and speed/simplicity) of instrsize and write_op_arg, I now think the best fix would instead just be adding another pass immediately before assemble_jump_offsets that does something like this:

    static void
    fix_opargs(struct assembler *a)
    { 
        for (basicblock *b = a->a_entry; b != NULL; b = b->b_next) {
            for (int i = 0; i < b->b_iused; i++) {
                if (b->b_instr[i].i_opcode < HAVE_ARGUMENT) {
                    // Opcodes with unused arguments may have been introduced during
                    // the optimization process:
                    b->b_instr[i].i_oparg = 0;
                }
            }
        }
    }

    Even though it's quite straightforward to fix, I only really think it's worth doing if it annoys us enough that many non-argument-using instructions are indeed emitted with nonzero arguments. I'm starting to feel like it will only complicate things and possibly slow down compilation times (granted, probably by a very small amount, but it's not nothing). Besides the above example, (which was informed by the linked PyInstaller discussion, thanks Rok), I haven't actually been able to coax the compiler into emitting these extra EXTENDED_ARGs in *any* other situation.

    TL;DR: This is super rare, doesn't actually cause any problems, and incurs a small compile-time cost to "fix".

    Close as "won't fix"?