MatthieuDartiailh / bytecode

Python module to modify bytecode
https://bytecode.readthedocs.io/
MIT License
302 stars 38 forks source link

EXTENDED_ARG + NOP Error #116

Closed EmmaJaneBonestell closed 1 year ago

EmmaJaneBonestell commented 1 year ago

When running Pynguin (https://github.com/se2p/pynguin) on certain programs (like https://github.com/bottlepy/bottle) , I am running into this error (https://github.com/python/cpython/issues/89918), but obviously arising from bytecode, not dis.

The trace and exact error changes betweem 0.13/0.14 of bytecode, but the issue stems from the same area.

You may reproduce it with the following modified example (from the above CPytho n error), either as a script or in a REPL:

from types import CodeType
from bytecode import Bytecode

constants = [None] * (0x000129 + 1)
constants[0x000129] = "Hello world!"

code = CodeType(
    0,  # argcount
    0,  # posonlyargcount
    0,  # kwonlyargcount
    0,  # nlocals
    1,  # stacksize
    64,  # flags
    bytes([
        0x90, 0x01,  # EXTENDED_ARG 0x01
        0x09, 0xFF,  # NOP 0xFF
        0x90, 0x01,  # EXTENDED_ARG 0x01
        0x64, 0x29,  # LOAD_CONST 0x29
        0x53, 0x00,  # RETURN_VALUE 0x00
    ]),  # codestring=
    tuple(constants),  # constants
    (),  # names
    (),  # varnames
    '<no file>',  # filename
    'code',  # name
    1,  # firstlineno
    b''  # linetable
)

print("Output:", eval(code))

print(list(Bytecode.from_code(code)))

On 0.14, this will give:

Output: Hello world!
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[3], line 32
      7 code = CodeType(
      8     0,  # argcount
      9     0,  # posonlyargcount
   (...)
     27     b''  # linetable
     28 )
     30 print("Output:", eval(code))
---> 32 print(list(Bytecode.from_code(code)))

File ~/.local/lib/python3.10/site-packages/bytecode/bytecode.py:282, in Bytecode.from_code(code, prune_caches, conserve_exception_block_stackdepth)
    276 @staticmethod
    277 def from_code(
    278     code: types.CodeType,
    279     prune_caches: bool = True,
    280     conserve_exception_block_stackdepth: bool = False,
    281 ) -> "Bytecode":
--> 282     concrete = _bytecode.ConcreteBytecode.from_code(code)
    283     return concrete.to_bytecode(
    284         prune_caches=prune_caches,
    285         conserve_exception_block_stackdepth=conserve_exception_block_stackdepth,
    286     )

File ~/.local/lib/python3.10/site-packages/bytecode/concrete.py:346, in ConcreteBytecode.from_code(code, extended_arg)
    339 # HINT : in some cases Python generate useless EXTENDED_ARG opcode
    340 # with a value of zero. Such opcodes do not increases the size of the
    341 # following opcode the way a normal EXTENDED_ARG does. As a
    342 # consequence, they need to be tracked manually as otherwise the
    343 # offsets in jump targets can end up being wrong.
    344 if not extended_arg:
    345     # The list is modified in place
--> 346     bytecode._remove_extended_args(instructions)
    348 bytecode.name = code.co_name
    349 bytecode.filename = code.co_filename

File ~/.local/lib/python3.10/site-packages/bytecode/concrete.py:749, in ConcreteBytecode._remove_extended_args(instructions)
    746 arg = (extended_arg << 8) + instr.arg
    747 extended_arg = None
--> 749 instr = ConcreteInstr(
    750     instr.name,
    751     arg,
    752     location=instr.location,
    753     extended_args=nb_extended_args,
    754 )
    755 instructions[index] = instr
    756 nb_extended_args = 0

File ~/.local/lib/python3.10/site-packages/bytecode/concrete.py:90, in ConcreteInstr.__init__(self, name, arg, lineno, location, extended_args)
     77 def __init__(
     78     self,
     79     name: str,
   (...)
     87     # Python to properly compute the size and avoid messing up the jump
     88     # targets
     89     self._extended_args = extended_args
---> 90     super().__init__(name, arg, lineno=lineno, location=location)

File ~/.local/lib/python3.10/site-packages/bytecode/instr.py:448, in BaseInstr.__init__(self, name, arg, lineno, location)
    440 def __init__(
    441     self,
    442     name: str,
   (...)
    446     location: Optional[InstrLocation] = None,
    447 ) -> None:
--> 448     self._set(name, arg)
    449     if location:
    450         self._location = location

File ~/.local/lib/python3.10/site-packages/bytecode/concrete.py:111, in ConcreteInstr._set(self, name, arg)
    106 def _set(
    107     self,
    108     name: str,
    109     arg: int,
    110 ) -> None:
--> 111     super()._set(name, arg)
    112     size = 2
    113     if arg is not UNSET:

File ~/.local/lib/python3.10/site-packages/bytecode/instr.py:645, in BaseInstr._set(self, name, arg)
    642 except KeyError:
    643     raise ValueError("invalid operation name")
--> 645 self._check_arg(name, opcode, arg)
    647 self._name = name
    648 self._opcode = opcode

File ~/.local/lib/python3.10/site-packages/bytecode/concrete.py:104, in ConcreteInstr._check_arg(self, name, opcode, arg)
    102 else:
    103     if arg is not UNSET:
--> 104         raise ValueError("operation %s has no argument" % name)

ValueError: operation NOP has no argument

For the reproducer at least, the following addition seems to have worked around the problem: https://github.com/MatthieuDartiailh/bytecode/blob/75948ede3a4ed70d28b3220d4399ac7516734f1e/src/bytecode/concrete.py#L741-L752

            if extended_arg is not None:
                if instr.name == "NOP":
                    arg = UNSET
                else:
                    arg = (extended_arg << 8) + instr.arg
                 instr = ConcreteInstr( 
                 ...

It would be nice if this could be backported to at least 0.13 as well, which would instead be lines 370 to 372 of concrete.py.

MatthieuDartiailh commented 1 year ago

Could you make a PR against main with the fix and a test ?

I do not usually make bugfix release for anything but the last version. Do you have a particular need for a bugfix on 0.13 ?

P403n1x87 commented 1 year ago

I do not usually make bugfix release for anything but the last version. Do you have a particular need for a bugfix on 0.13 ?

Those still using Python 3.6 or 3.7 won't be able to upgrade to a version with the bugfix

EmmaJaneBonestell commented 1 year ago

Could you make a PR against main with the fix and a test ?

I do not usually make bugfix release for anything but the last version. Do you have a particular need for a bugfix on 0.13 ?

Yes, I'll submit it within the next day.

And I requested it simply because Pynguin is currently dependent on version 0.13 and using 0.14 didn't appear to be a simple replacement, but I didn't look too closely.

Those still using Python 3.6 or 3.7 won't be able to upgrade to a version with the bugfix

That's true as well.