project-everest / everparse

Automated generation of provably secure, zero-copy parsers from format specifications
https://project-everest.github.io/everparse
Apache License 2.0

Support for validating pointer-rich formats given a probe to check for pointer validity #118

Closed nikswamy closed 11 months ago

nikswamy commented 11 months ago

This PR provides support for a "probe and copy" feature, allowing input formats to contain pointers that will be followed by the validator after a user-provided "probe" function checks that the pointer refers to legal memory for a given extent.

The following example, from tests/Probe.3d, illustrates the feature:

extern probe ProbeAndCopy

typedef struct _T {
  UINT32 x { x >= 17 };
  UINT32 y { y >= x };
} T;

entrypoint
typedef struct _S(EVERPARSE_COPY_BUFFER_T dest) {
  UINT8 tag;
  T *t probe (length = 8, destination = dest);
} S;
The probe itself is supplied by the user as an external C function with this signature:

extern BOOLEAN ProbeAndCopy(uint64_t pointer_value, uint64_t length, EVERPARSE_COPY_BUFFER_T destination);

This function should probe pointer_value and, if the probe succeeds, copy length bytes into destination. You can use whatever existing (typically kernel) primitive you have for this; it should take care of exception handling, etc., and just return a boolean.
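To make the contract concrete, here is a minimal user-space sketch of a probe function. It is not the implementation used by EverParse or by any kernel: the typedefs for BOOLEAN and EVERPARSE_COPY_BUFFER_T are stand-ins for whatever the real headers provide, copy_buffer_t and set_readable_range are hypothetical helpers, and the "legal memory" check is simulated with a single registered range where a real probe would use a kernel primitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins (assumptions): the real definitions come from EverParse's headers. */
typedef bool BOOLEAN;

/* Hypothetical copy buffer: a destination pointer plus its capacity in bytes. */
typedef struct {
  uint8_t *buf;
  uint64_t capacity;
} copy_buffer_t;
typedef copy_buffer_t *EVERPARSE_COPY_BUFFER_T;

/* Hypothetical model of "legal memory": one readable range set by the caller. */
static const uint8_t *g_readable_base = NULL;
static uint64_t g_readable_len = 0;

void set_readable_range(const void *base, uint64_t len) {
  assert(base != NULL || len == 0);
  g_readable_base = (const uint8_t *)base;
  g_readable_len = len;
}

/* True iff [ptr, ptr + length) lies inside the registered range.
   Written with subtractions so that ptr + length cannot overflow. */
static bool range_is_readable(uint64_t ptr, uint64_t length) {
  uint64_t start = (uint64_t)(uintptr_t)g_readable_base;
  if (g_readable_base == NULL || ptr < start) {
    return false;
  }
  uint64_t offset = ptr - start;
  return offset <= g_readable_len && length <= g_readable_len - offset;
}

/* Probe pointer_value for length bytes; on success copy them into destination.
   Every failure is reported through the boolean result only. */
BOOLEAN ProbeAndCopy(uint64_t pointer_value, uint64_t length,
                     EVERPARSE_COPY_BUFFER_T destination) {
  if (destination == NULL || destination->buf == NULL) {
    return false;
  }
  if (length > destination->capacity) {
    return false; /* the copy would not fit in the destination */
  }
  if (!range_is_readable(pointer_value, length)) {
    return false; /* the probe failed: the range is not known to be legal */
  }
  memcpy(destination->buf, (const void *)(uintptr_t)pointer_value,
         (size_t)length);
  return true;
}
```

In this sketch a caller would register the source range with set_readable_range, wrap an output array in a copy_buffer_t, and pass its address as destination. The capacity check is a choice made for the sketch; the description above only requires that the function probe, copy, and return a boolean.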

Disjointness preconditions

This required a change to the abstraction of validators and actions. They are now indexed by an additional "disjointness precondition", which ensures that, while the contents of a copy buffer are being validated, no action that runs during that validation modifies the copy buffer itself. For example, the disjointness precondition causes us to (rightfully) reject the specification below: to validate S *s we copy into dest, but while validating S we copy into dest again on encountering T *t. This is detected and rejected.

//Nested probing of the same buffer; should fail
entrypoint
typedef struct _R(EVERPARSE_COPY_BUFFER_T dest) {
  UINT8 tag;
  S(dest) *s probe (length = 9, destination = dest);
} R;

Compiler performance

The disjointness precondition adds overhead to the verification of all 3D specs, even those that do not use copy buffers, since the abstraction forces us to compute the precondition anyway. On small specifications, the overhead does not seem to be noticeable. However, on a large internal benchmark, I measured an end-to-end slowdown of around 20% in verification time. I am still contemplating ways to reduce that overhead, though at least some of it seems unavoidable, since we do have to compute the disjointness precondition for soundness. Perhaps there is a way to avoid computing it in the common case where probing is not used.