Why is including the offset beyond buffer in the bug id needed?

dms1lva commented 1 year ago

Hello,

I am trying to understand why including the offset beyond the buffer at which the access violation occurs is needed to produce the bug id. In practice, I am seeing that what seems to be the same bug, with a different offset beyond the buffer, is producing different bug ids.

From an exploit development perspective, I can see that it could be useful to have different PoCs of the same bug but with different buffer offsets. Are there other reasons?

Thanks!

SkyLined commented 1 year ago

Hey, sorry for the late reply.

You are correct: knowing that the offset is variable is useful when determining the exploitability but does reduce your ability to bucketize issues using the BugId.

In order to limit the number of BugIds for the same bug, but maintain some ability to detect this variability, I've added the mBugId.uArchitectureIndependentBugIdBits option. I often set it to 32 during fuzzing, so I get the same BugIds for crashes with a static offset in the 32-bit and 64-bit versions of an application, but also get a limited number of different BugIds if the offset can vary in 1-byte steps (0, 1, 2, 3, 4n, 4n+1, 4n+2, 4n+3).
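To illustrate the bucketing described above, here is a minimal Python sketch. It is not BugId's actual code: the function name `bucket_offset`, its signature, and the exact label strings are assumptions made for illustration, based only on the eight labels listed above for a setting of 32. It also assumes non-negative offsets.

```python
def bucket_offset(offset: int, bits: int = 32) -> str:
    """Collapse a byte offset into a coarse label (hypothetical helper).

    With bits=32 the unit is 4 bytes, so every offset maps to one of
    "0", "1", "2", "3", "4n", "4n+1", "4n+2", "4n+3".
    """
    if bits <= 0 or bits % 8 != 0:
        raise ValueError("bits must be a positive multiple of 8")
    if offset < 0:
        raise ValueError("this sketch only handles non-negative offsets")
    unit = bits // 8  # bytes per unit, e.g. 32 bits -> 4 bytes
    if offset < unit:
        # Small offsets are kept exactly.
        return str(offset)
    remainder = offset % unit
    # Larger offsets keep only their remainder modulo the unit.
    return f"{unit}n" if remainder == 0 else f"{unit}n+{remainder}"


if __name__ == "__main__":
    for offset in (0, 3, 4, 8, 13, 16):
        print(offset, "->", bucket_offset(offset))
```

In this sketch, offsets such as 8 and 16 both collapse to the same label, which is why a crash whose offset scales with the pointer size can end up in a single bucket, while an offset that varies byte by byte still produces a small, bounded set of labels.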

See modules\mBugId\dxConfig.py for details. You can set this option on the command line using --mBugId.uArchitectureIndependentBugIdBits=32.

If you set it to 8, you get either the id with no offset if the offset is 0, or +n if it's larger than zero.

dms1lva commented 1 year ago

Thanks for the answer! That definitely makes sense then.
