rust-formal-methods / wg

22 stars 2 forks source link

non-derived hash and partialeq impls #2

Open gilescope opened 2 years ago

gilescope commented 2 years ago

I got thinking based on this clippy lint https://rust-lang.github.io/rust-clippy/master/#derive_hash_xor_eq about whether formal methods might be able to have a tool that checks that the hash implementations and the equality implementations meet a hashmap's requirements.

This is a common and insidious problem in computing, and while proving all implementations is probably not possible, I have a hunch that we can automatically prove that the vast majority of these implementations do maintain the required invariants because it's a very constrained problem.

Is there prior art on trying to tackle this problem? I could see this adding a lot of value if we could solve this problem. At the moment when we impl PartialEq and Hash we're saying trust the programmer, but it would be cool to verify.

vakaras commented 2 years ago

Most verification tools focus on checking properties that tell something about a single execution. An example of such property would be “the function max returns one the values it received as an argument”, which can be written in Prusti as follows:

#[ensures(result == a || result == b)]
fn max(a: i32, b: i32) -> i32 {
  if a > b { a } else { b }
}

The specifications of PartialEq, PartialOrd, and Hash relate multiple executions. Such properties are called hyper-properties. Verifying hyper-properties in a general case is much harder than normal properties (if there is interest, I could ask my colleagues who are working on this to suggest some reading on this topic). However, in the case of PartialEq, PartialOrd, and Hash we could probably assume that the methods are deterministic and without side effects (in other words, could be annotated with Rust's const or Prusti's #[pure]), which gives hope that we could stay within a fragment reachable to automated techniques.

Currently, Prusti does not have dedicated syntax for specifying properties relating multiple executions. @xldenis Maybe, this would be an interesting example for the specification challenge?

xldenis commented 2 years ago

Currently, Prusti does not have dedicated syntax for specifying properties relating multiple executions. @xldenis Maybe, this would be an interesting example for the specification challenge?

Definitely, if you can boil down an example I think this would be great.

However, I don't understand how ensuring that Hash and PartialEq are compatible requires relating multiple executions? As stated in the linked docs this just requires proving a theorem about Hash and PartialEq, you could imagine a trait WellBehaved with the following Creusot style specification:

trait WellBehaved : Hash + Eq {
  law! { forall<i, j> i == j -> i.hash() == j.hash() }
}

However, I have no idea how you would go about proving this in practice, seems like a challenging and tedious proof to do.

vakaras commented 2 years ago

Currently, Prusti does not have dedicated syntax for specifying properties relating multiple executions. @xldenis Maybe, this would be an interesting example for the specification challenge?

Definitely, if you can boil down an example I think this would be great.

I think the simplest example would be how one would specify PartialEq (I would leave the generic case for some later time). Instead of your law!, an alternative would be to use #[invariant(...)] on the trait.

#[invariant(
    (implements(A, PartialEq<B>) && implements(B, PartialEq<A>)) ==>
    forall(|x: A, y: B| x.eq(y) ==> y.eq(x))
)]
#[invariant(
    (implements(A, PartialEq<B>) && implements(B, PartialEq<C>) && implements(A, PartialEq<C>)) ==>
    forall(|x: A, y: B, z: C| (x.eq(y) && y.eq(z)) ==> x.eq(z))
)]
pub trait PartialEq<Rhs = Self> where Rhs: ?Sized, {
    fn eq(&self, other: &Rhs) -> bool;
}

However, I don't understand how ensuring that Hash and PartialEq are compatible requires relating multiple executions? As stated in the linked docs this just requires proving a theorem about Hash and PartialEq, you could imagine a trait WellBehaved with the following Creusot style specification:

trait WellBehaved : Hash + Eq {
  law! { forall<i, j> i == j -> i.hash() == j.hash() }
}

I think that in this case, your specification is sufficient. However, correct me if I am wrong, but it assumes that equality and hash functions are pure (deterministic and without any side effects).

However, I have no idea how you would go about proving this in practice, seems like a challenging and tedious proof to do.

I think that in many cases unrolling the definitions and throwing into an SMT solver should be sufficient. (Of course, you need to remove things like loops and handle non-linear arithmetic.)

xldenis commented 2 years ago

You should keep that specification of PartialEq for the RFMIG meeting! That implements predicate is interesting!

I think that in this case, your specification is sufficient. However, correct me if I am wrong, but it assumes that equality and hash functions are pure (deterministic and without any side effects).

Yea, what went unstated in my example is that I would assume that Eq and Hash require implementors to define a logical equality / hash which they realize through the implementation, and that's what we're seeing in the specifications.

Something like (yes I know this isn't the real rust trait):

trait Eq {
  logic! { fn equiv_relation(self, other: Self) -> bool }

  #[ensures(result <-> equiv_relation(self, other))]
  fn eq(&self, other : Self) -> bool;
}
gilescope commented 2 years ago

I'm probably jumping the gun here, but pragmatically the less non-rust dependencies in the solution, the more places people would be able to use the check. (E.g. if it was pure rust, there's zero adoption barriers and it could be integrated into clippy and in widespread use.)

vakaras commented 2 years ago

I'm probably jumping the gun here, but pragmatically the less non-rust dependencies in the solution, the more places people would be able to use the check. (E.g. if it was pure rust, there's zero adoption barriers and it could be integrated into clippy and in widespread use.)

I agree with you. However, for properties like these, I am pretty sure you will need a verifier with an automatic theorem prover.