Open hummy123 opened 1 month ago
Oh. 😅 My bad; I missed the polyvariance pass in my research. Thanks for the link.
I'm a bit surprised though. This paper claims that specialised defunctionalisation is often faster, so I would have expected it to be better for the runtime dispatch to be entirely eliminated.
The specialised defunctionalisation technique in that paper doesn't completely eliminate runtime dispatches, but it does try to minimise the number of case variants, so that a higher order function doesn't accumulate case variants when used in different parts of the program.
Thanks again for the link! Something for me to play around with.
The Polyvariance
pass will, essentially, eliminate the case dispatch, because there will be a single variant for the function argument to the duplicated function. We don't try to "force" the variant to be eliminated during defunctionalisation, but it will tend to be eliminated by subsequent SSA optimizations.
Unrestricted polyvariant duplication can lead to unacceptable code growth and I don't believe that you will be able to completely eliminate the need for some runtime dispatch (in Standard ML).
Rust's monomorphisation and treatment of the Fn
-family of traits does essentially eliminate all runtime dispatch (when not using a dyn Fn
), so effectively implements the technique described in the initial message of this issue. However, MLton doesn't monomorphise with respect to specific lambdas (the way that Rust does), so there are programs expressible in SML that aren't expressible in Rust (without resorting to dyn Fn
).
The Lambda Set Specialization paper is on by "to read" shortlist, since it does offer some evidence that more aggressive specialization up front would be a performance win.
Hi there,
I was implementing a trie/prefix tree data structure (here) and, while considering how to implement fold on the data structure, I think I noticed a conceptually simple defunctionalisation technique that doesn't require runtime dispatching on a datatype.
The kind of transformation I have in mind would turn the higher-order function
Vector.foldl
(for example) from:Vector.foldl (fn (elt, acc) => elt + acc)
to:
so that MLton's defunctoriser in the elaborate pass doesn't need to construct a sun type over every higher order function and dispatch to the appropriate function at runtime.
I would like to have a go at implementing this kind of optimisation in MLton myself, but I might give up considering my total inexperience with coding compilers 😅 so I thought it wouldn't be bad to post it here in the event I get frustrated and fail, and there is interest from others in implementing the idea.
The transformation looks general to me (can mutate values in the environment record if user wants) and doesn't seem to require anything fundamentally novel to implement, but I would be interested in hearing criticism if it's not possible too.