deriving different equality functions for the same type

ocaml-ppx / ppx_deriving

Type-driven code generation for OCaml

MIT License

466 stars 89 forks source link

deriving different equality functions for the same type #249

Open mjambon opened 3 years ago

mjambon commented 3 years ago

We have a large AST definition made of many node types and we'd like to derive two different sets of equality functions:

one using structural equality, ignoring node IDs
one using only the node IDs

Our code looks like this:

type stmt = {
  id : int [@deriving equal fun _ _ -> true]; (* ignore if generated function prefix is "equal_" *)
  value : stmt_value [@deriving equal2 fun _ _ -> true]; (* ignore if prefix is "equal2_" *)
}
and stmt_value = ...
and expr = ...
...
[@@deriving eq, eq2] (* "eq2" is hypothetical, generates functions with prefix "equal2_" *)

We want one set of functions with the equal_ prefix and another set with the equal2_ prefix (or better, a custom name).

Is there currently a way to do such thing? Any suggestion on how to make this work?

gasche commented 3 years ago

Several half-baked ideas:

If you have a map function on your type, one way to compare while ignoring ids is to "normalize" the data by mapping all ids to a dummy value. (This is probably very costly, though.)
[@equal fun a b -> if !compare_by_id then ... else ...] might work if the two different comparisons never call each other (so it works set mutate the flag for the duration of a call).
ppx_import lets you re-import a definition and add new deriving attributes, I think, but given the size of your declarations this would be very painful.
Personally I think that this rather a task for a more flexibility-oriented approach, such as visitors. (But I don't know how much support visitors has for parallel/binary visitors, visiting two structures in sync'.

mjambon commented 3 years ago

Thanks @gasche! I'll ruminate this. The trick using a ref (2) is probably the simplest and could be made foolproof by setting a flag that indicates whether a comparison is in progress. Something like this:

type busy_with_equal = Not_busy | Structural_equal | Referential_equal
let busy_with_equal = ref Not_busy

let equal_id a b =
  match !busy_with_equal with
  | Not_busy -> assert false
  | Structural_equal -> true
  | Referential_equal -> a = b

let equal_stmt_value_ref = ref (fun a b -> failwith "not initialized")

let equal_value a b =
  match !busy_with_equal with
  | Not_busy -> assert false
  | Structural_equal -> !equal_stmt_value_ref a b
  | Referential_equal -> true

type stmt = {
  id : int [@equal equal_id];
  value : stmt_value [@equal equal_value];
}
and expr = ...
...

let () = equal_stmt_value_ref := equal_stmt_value

let with_structural_equal equal a b =
  Fun.protect
    ~finally:(fun () -> busy_with_equal := Not_busy)
    (fun () ->
       match !busy_with_equal with
       | Not_busy ->
           busy_with_equal := Structural_equal;
           equal a b
       | Structural_equal
       | Referential_equal ->
           failwith "an equal is already in progress"
    )

let with_referential_equal equal a b =
  Fun.protect
    ~finally:(fun () -> busy_with_equal := Not_busy)
    (fun () ->
       match !busy_with_equal with
       | Not_busy ->
           busy_with_equal := Referential_equal;
           equal a b
       | Structural_equal
       | Referential_equal ->
           failwith "an equal is already in progress"
    )

gasche commented 3 years ago

I think there is a minor bug in your code: if you failwith because an equality is in progress, the ~finally block will unset the work status, causing the ongoing equality tests (assuming we are in a scenario where calls indeed interleave) to assert false. It would be more correct with Fun.protect only in the branch where you set the reference.

mjambon commented 3 years ago

nice catch, thank you.

For the record, this is the corrected code (still untested):

let with_structural_equal equal a b =
  match !busy_with_equal with
  | Not_busy ->
      busy_with_equal := Structural_equal;
      Fun.protect
        ~finally:(fun () -> busy_with_equal := Not_busy)
        (fun () -> equal a b)
  | Structural_equal
  | Referential_equal ->
      failwith "an equal is already in progress"

(* + similar fix for with_referential_equal *)

mjambon commented 3 years ago

Also, the following doesn't work (type stmt_value escapes its scope):

let equal_stmt_value_ref = ref (fun a b -> failwith "not initialized")

let equal_value a b =
  match !busy_with_equal with
  | Not_busy -> assert false
  | Structural_equal -> !equal_stmt_value_ref a b
  | Referential_equal -> true

This is due to generated code being of the form:

let rec equal_stmt =
  let _0 () = ... in
  fun a b -> ...

When one needs to depend on one of the generated functions in the same let rec group, a solution is to pass the required function as an argument, so as to avoid the use of a ref:

let equal_value equal_stmt_value a b =
  match !busy_with_equal with
  | Not_busy -> assert false
  | Structural_equal -> equal_stmt_value a b
  | Referential_equal -> true

and the ppx annotation becomes this:

type stmt = {
  id : int [@equal equal_id];
  value : stmt_value [@equal (equal_value equal_stmt_value)]; (* parens for clarity only *)
}