ocaml-flambda / flambda-backend

The Flambda backend project for OCaml
105 stars 76 forks source link

Optimise certain pattern matching into equality #2271

Open riaqn opened 7 months ago

riaqn commented 7 months ago

Example:

type (_, _) eq =
  | Refl : ('a, 'a) eq

type 'a t =
  | String : string t
  | Int : int t
  | Float : float t

let eq: type a b. a t -> b t -> (a, b) eq option
= fun a b ->
  match a, b with
  | String, String -> Some Refl
  | Int, Int -> Some Refl
  | Float, Float -> Some Refl
  | _ -> None

The corresponding cmm with -O3:

$ _build/install/main/bin/ocamlopt.opt foo.ml -I _build/install/runtime_stdlib/lib/ocaml_runtime_stdlib/ -dcmm -c -O3
(data global "camlFoo__gc_roots": int 0)
(data int 1792 global "camlFoo": addr G:"camlFoo__eq_1")
(data
 int 4087
 global "camlFoo__eq_1":
 addr G:"caml_curry2"
 int 180143985094819847
 addr G:"camlFoo__eq_0_1_code")
(data
 int 4092
 "camlFoo__invalid58":
 string "unreachable switch case"
 skip 0
 byte 0)
(data int 1792 global "camlFoo__const_block14": int 1)
(function{foo.ml:9,4-181} camlFoo__eq_0_1_code (a/299: int b/300: int)
 (let switcher/305 (>>s a/299 1)
   (switch switcher/305 
   case 0: (if (!= b/300 1) 1 G:"camlFoo__const_block14")
   case 1: (if (!= b/300 3) 1 G:"camlFoo__const_block14")
   case 2: (if (>= b/300 5) G:"camlFoo__const_block14" 1))))

(function no_cse reduce_code_size linscan ignore zero_alloc camlFoo__entry 
 () (catch (exit 2 G:"camlFoo") with(2 *ret*/296: val) 1))

(data)

But I imagine this can be optimized into %equal which should translate to integer comparison, which should be much faster.

lthls commented 7 months ago

Would it be useful if we had a version that only works on boolean-like results ? Something like that:

type (_, _) bool_with_eq =
| False : ('a, 'b) bool_with_eq
| True : ('a, 'a) bool_with_eq

The names of the constructors do not really matter, what matters is that the function is semantically equivalent to the equality primitive, and not equality composed with something else.

That version might be doable with an attribute: we would run a check that the function with such an attribute is indeed equivalent to the equality function, and replace its body with the identity primitive (of the appropriate type). It is a bit more complicated for recursive versions, as you have to prove that the types are inductive, and currently most of the recursive types aren't because of let rec.

Also, be careful that "%equal" does not always have better performance than a hand-written function. Here in particular, since you take inputs of possibly different types you need to cast the inputs before calling (=), and that might prevent specialisation of the primitive. You would then end up with a C call instead of just a few comparisons.

Anyway, this looks like a good internship project, so unless it's urgent I will probably wait until I find someone to work on that.

riaqn commented 7 months ago

Would it be useful if we had a version that only works on boolean-like results?

That's good point.

Could you elaborate your point about recursive types? Do you mean the hand-written function would be different from %equal? Say you have a non-inductive value, I imagine both functions would be non-terminating?

Here in particular, since you take inputs of possibly different types you need to cast the inputs before calling (=)

Not sure if I understand - replacing manually written function with %equal happens in the middle end so shouldn't need casting?

lthls commented 7 months ago

Say you have a non-inductive value, I imagine both functions would be non-terminating?

No, "%equal" may or may not terminate on cyclic values. Typically two physically equal structures are assumed to be structurally equal, so "%equal" might terminate on cyclic structures, and in some contexts this might be unsound.

Not sure if I understand - replacing manually written function with %equal happens in the middle end so shouldn't need casting?

I was referring to an implementation in source code using Obj here. I agree that if it's implemented in the compiler it's fine, although it's still a bit annoying to recover the right equality primitive in the middle-end (not enough type information).