google / atheris

Apache License 2.0

Integrate Slipcover to Atheris #73

Open ligurio opened 11 months ago

ligurio commented 11 months ago

From the beginning, Atheris used [^1] sys.settrace-like instrumentation, the same kind of instrumentation used in Coverage.py [^2] [^3]:

> Atheris is a native Python extension, and is typically compiled with libFuzzer linked in. When you initialize Atheris, it registers a tracer with CPython to collect information about Python code flow. This tracer can keep track of every line reached and every function executed.
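
To make the tracer idea concrete, here is a minimal sketch of line tracing with the standard `sys.settrace` hook. It is not Atheris code: `collect_line_coverage` and `classify` are hypothetical names made up for illustration, and Atheris registered its tracer from a native extension rather than from Python.

```python
import sys


def collect_line_coverage(func, *args, **kwargs):
    """Run func under a sys.settrace tracer and return the executed lines.

    Lines are reported as (function name, offset from the `def` line).
    Hypothetical helper for illustration only.
    """
    executed = set()

    def tracer(frame, event, arg):
        if event == "line":
            code = frame.f_code
            executed.add((code.co_name, frame.f_lineno - code.co_firstlineno))
        # Returning the tracer keeps line events enabled for this frame.
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        # Always restore whatever tracer was installed before.
        sys.settrace(previous)
    return executed


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"
```

Calling `collect_line_coverage(classify, 5)` and `collect_line_coverage(classify, -1)` yields different line sets, which is the signal a coverage-guided fuzzer uses to decide that an input reached new code. The tracer runs on every line event, which is where the overhead of this approach comes from.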

In commit e76f6375ec69f01b6794de779588b8567d5de943, sys.settrace was replaced with bytecode instrumentation [^6].

SlipCover is a Python library that tracks a Python program as it runs and reports which parts executed and which did not. It uses just-in-time instrumentation and de-instrumentation, and it is reported to provide precise coverage with near-zero overhead [^4] [^5].
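
The de-instrumentation idea can be illustrated with a probe that switches itself off after it fires once. The sketch below does not reproduce SlipCover's bytecode rewriting. As a stand-in it uses CPython's `sys.monitoring` API, which needs Python 3.12 or newer, and the names `run_with_self_disabling_probes`, `sum_to`, and the tool id `3` are assumptions chosen for illustration.

```python
import sys

TOOL_ID = 3  # arbitrary tool id, assumed to be unused in this process


def run_with_self_disabling_probes(func, *args):
    """Run func with LINE probes that disable themselves after the first hit.

    Returns (covered lines, number of probe callbacks).
    Requires Python 3.12+ (sys.monitoring). Hypothetical helper.
    """
    mon = sys.monitoring
    covered = set()
    callbacks = 0

    def on_line(code, line_number):
        nonlocal callbacks
        callbacks += 1
        covered.add((code.co_name, line_number - code.co_firstlineno))
        # Turn this probe off: the location will not report again.
        return mon.DISABLE

    mon.use_tool_id(TOOL_ID, "coverage-sketch")
    mon.restart_events()  # re-enable locations disabled by an earlier run
    mon.register_callback(TOOL_ID, mon.events.LINE, on_line)
    # Instrument only the target function's code object.
    mon.set_local_events(TOOL_ID, func.__code__, mon.events.LINE)
    try:
        func(*args)
    finally:
        mon.set_local_events(TOOL_ID, func.__code__, 0)
        mon.register_callback(TOOL_ID, mon.events.LINE, None)
        mon.free_tool_id(TOOL_ID)
    return covered, callbacks


def sum_to(n):
    total = 0
    for i in range(n):
        total += i
    return total
```

With `run_with_self_disabling_probes(sum_to, 1000)`, the loop body runs a thousand times but the callback count stays small, because each location stops reporting once it has been recorded. That is the property that keeps the steady-state cost of this style of coverage low.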

I propose reusing SlipCover's source code in Atheris.

[^5]: SlipCover: Near Zero-Overhead Code Coverage for Python -- Juan Altmayer Pizzorno, Emery D. Berger

AidenRHall commented 11 months ago

Thanks for the suggestion!

Although this library seems nice, using it would also seem to represent a major change to Atheris as a significant portion of our codebase is devoted to patching Python bytecode. Julian and I can consider this more seriously when we start planning for next year as it could be beneficial, however more analysis is required to show that it would be worth the effort.

Just out of curiosity, what led to this suggestion? I don't see your name on the UMass PLASMA website - are you somehow affiliated with this project via another way?

ligurio commented 11 months ago

> Although this library seems nice, using it would also seem to represent a major change to Atheris as a significant portion of our codebase is devoted to patching Python bytecode. Julian and I can consider this more seriously when we start planning for next year as it could be beneficial, however more analysis is required to show that it would be worth the effort.

Thanks for the explanation. The final decision is entirely up to you, of course. Still, I believe it would be good to reuse existing projects and combine efforts.

> Just out of curiosity, what led to this suggestion? I don't see your name on the UMass PLASMA website - are you somehow affiliated with this project via another way?

I'm not affiliated with the PLASMA group. I'm building my own fuzzing engine for another scripting language (Lua), and Atheris is one of the reference implementations that inspire me; another is Jazzer.js.