MLton / mlton

The MLton repository
http://mlton.org
Other
952 stars 127 forks source link

Alternative defunctionalisation technique #581

Open hummy123 opened 1 month ago

hummy123 commented 1 month ago

Hi there,

I was implementing a trie/prefix tree data structure (here) and, while considering how to implement fold on the data structure, I think I noticed a conceptually simple defunctionalisation technique that doesn't require runtime dispatching on a datatype.

The kind of transformation I have in mind would turn the higher-order function Vector.foldl (for example) from:

Vector.foldl (fn (elt, acc) => elt + acc)

to:

signature VECTOR_FOLDL =
sig
  type vectorElementType 
  type environment
  type accumulator
  val folder : vectorElementType * environment * accumulator -> accumulator 
end

functor MakeVectorFoldl (Fn : VECTOR_FOLDL) =
struct
  fun helpFoldl (pos, env, acc, vec) =
    if pos = Vector.length vec
    then acc
    else
      let
        val elt = Vector.sub (vec, pos)
        val acc = Fn.folder (elt, env, acc)
      in
        helpFoldl (pos + 1, env, acc, vec)
      end

  fun foldl (env, initial, vec) = helpFoldl (0, env, initial, vec)
end

structure PlusFolder =
struct
  type vectorElementType = int
  type environment= unit
  type accumulator = int
  fun folder (elt, _, acc) = elt + acc
end

structure PlusVectorFoldl = MakeVectorFoldl (PlusFolder)

(* higher order function which I think will not have a runtime dispatch cost *)
val foldl = PlusVector.foldl 

so that MLton's defunctoriser in the elaborate pass doesn't need to construct a sun type over every higher order function and dispatch to the appropriate function at runtime.

I would like to have a go at implementing this kind of optimisation in MLton myself, but I might give up considering my total inexperience with coding compilers 😅 so I thought it wouldn't be bad to post it here in the event I get frustrated and fail, and there is interest from others in implementing the idea.

The transformation looks general to me (can mutate values in the environment record if user wants) and doesn't seem to require anything fundamentally novel to implement, but I would be interested in hearing criticism if it's not possible too.

YawarRaza7349 commented 1 month ago

http://mlton.org/Polyvariance

hummy123 commented 1 month ago

Oh. 😅 My bad; I missed the polyvariance pass in my research. Thanks for the link.

I'm a bit surprised though. This paper claims that specialised defunctionalisation is often faster, so I would have expected it to be better for the runtime dispatch to be entirely eliminated.

The specialised defunctionalisation technique in that paper doesn't completely eliminate runtime dispatches, but it does try to minimise the number of case variants, so that a higher order function doesn't accumulate case variants when used in different parts of the program.

Thanks again for the link! Something for me to play around with.

MatthewFluet commented 1 month ago

The Polyvariance pass will, essentially, eliminate the case dispatch, because there will be a single variant for the function argument to the duplicated function. We don't try to "force" the variant to be eliminated during defunctionalisation, but it will tend to be eliminated by subsequent SSA optimizations.

Unrestricted polyvariant duplication can lead to unacceptable code growth and I don't believe that you will be able to completely eliminate the need for some runtime dispatch (in Standard ML).

Rust's monomorphisation and treatment of the Fn-family of traits does essentially eliminate all runtime dispatch (when not using a dyn Fn), so effectively implements the technique described in the initial message of this issue. However, MLton doesn't monomorphise with respect to specific lambdas (the way that Rust does), so there are programs expressible in SML that aren't expressible in Rust (without resorting to dyn Fn).

The Lambda Set Specialization paper is on by "to read" shortlist, since it does offer some evidence that more aggressive specialization up front would be a performance win.