google / atheris

Apache License 2.0
1.38k stars 111 forks source link

Instrumentation fails for the "calendar" module from the standard library #30

Closed cuu508 closed 2 years ago

cuu508 commented 2 years ago

Code:

import atheris

with atheris.instrument_imports():
    import calendar

Output:

INFO: Instrumenting calendar
Traceback (most recent call last):
  File "/home/cepe/repos/croniter-fuzzing/atheris_test.py", line 4, in <module>
    import calendar
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 879, in exec_module
  File "/home/cepe/venvs/croniter-fuzzing/lib/python3.10/site-packages/atheris/import_hook.py", line 196, in get_code
    return patch_code(code, self._trace_dataflow)
  File "/home/cepe/venvs/croniter-fuzzing/lib/python3.10/site-packages/atheris/instrument_bytecode.py", line 756, in patch_code
    inst.consts[i] = patch_code(inst.consts[i], trace_dataflow, nested=True)
  File "/home/cepe/venvs/croniter-fuzzing/lib/python3.10/site-packages/atheris/instrument_bytecode.py", line 758, in patch_code
    return inst.to_code()
  File "/home/cepe/venvs/croniter-fuzzing/lib/python3.10/site-packages/atheris/instrument_bytecode.py", line 457, in to_code
    self._check_state()
  File "/home/cepe/venvs/croniter-fuzzing/lib/python3.10/site-packages/atheris/instrument_bytecode.py", line 372, in _check_state
    listing[i].check_state()
  File "/home/cepe/venvs/croniter-fuzzing/lib/python3.10/site-packages/atheris/instrument_bytecode.py", line 208, in check_state
    assert jump_arg_bytes(self.arg) == self.reference
AssertionError
TheShiftedBit commented 2 years ago

Oh, thanks for the heads-up, we'll look into it.

TheShiftedBit commented 2 years ago

Looks like this is specific to the 3.10 upgrade

TheShiftedBit commented 2 years ago

I believe I've identified the issue. After instrumenting code, Atheris needs to fix-up instruction offsets. In one code path in adjust(), the offsets are adjusted like this:

          self.reference += size  # type: ignore[operator]
          self.arg += size

Calling self.arg += size in 3.10 is incorrect, because the argument to jump instructions changed from a byte offset to an instruction offset. The correct code is:

          self.reference += size  # type: ignore[operator]
          self.arg = add_bytes_to_jump_arg(self.arg, size)

This wasn't detected during testing because this particular code path is only triggered in the unlikely event that (1) the argument is an offset for an absolute jump instruction, and (2) the argument was previously <256 but is now >=256.

Here's a minimum reproducer of this issue:

import atheris

@atheris.instrument_func
def _ord2ymd(n):
    n -= 1
    n400, n = divmod(n, 1)
    year = n400 * 400 + 1

    n100, n = divmod(n, 1)

    n4, n = divmod(n, 1)

    n1, n = divmod(n, 1)

    leapyear = n1 == 3 and (n4 != 24 or n100 == 3)
    assert leapyear == baz(year)
    month = (n + 50) >> 5
    preceding = baz[month] + (month > 2 and leapyear)
    if preceding > n:
        month -= 1
        preceding -= bar[month] + (month == 2 and leapyear)
    n -= preceding
    assert 0 <= n < foo()
TheShiftedBit commented 2 years ago

We'll validate that this fix is indeed correct, and if so cut a release, most likely tomorrow.

TheShiftedBit commented 2 years ago

Issue fixed and released as 2.0.11.