yeslogic / fathom

🚧 (Alpha stage software) A declarative data definition language for formally specifying binary data formats. 🚧
Apache License 2.0
259 stars 14 forks source link

Refinement types #258

Open brendanzab opened 3 years ago

brendanzab commented 3 years ago

Refinement types are an important part of what we hope to add to Fathom. These should allow us to avoid the overflow and out-of-bounds access errors that are a common source of bugs and security vulnerabilities in hand-written binary parsers. They allow you to refine types with predicates that can be automatically discharged by an external solver like Z3 or CVC4.

This is an example of an integer refined by a predicate:

{ x : Int | 0 <= x && x < 256 } : Type

We might want a format version of this, eg:

{ sfnt_version : U32Be | sfnt_version == 0x00010000 || sfnt_version == 'OTTO' } : Format

Previous, obsolete Issues: #63, #105, #123, #158

Resources

Examples of languages with support for this refinement types are:

Academic papers and reference implementations to learn from:

Some relevant Rust crates:

Challenges

We'll need some way to statically enforcing the subset of terms that can be sent to the solver. I'm thinking we could have another universe for this? Perhaps Smt or Pred? I haven't thought to carefully about the details though. F* uses effects for this but I'm hoping we can come up with a simpler solution.

It would also be nice to get a list of the lemmas we need to discharge at the end of elaboration. That way we could solve them in a way that is independent of a specific tool. Perhaps this could be returned using some standard format, like SMT-LIB 2?

brendanzab commented 1 year ago

This looks potentially related Compiling Higher-Order Specifications to SMT Solvers: How to Deal with Rejection Constructively. Seems to be about producing good errors when a predicate cannot be translated to an SMT solver?