Function sharing - Githubissues

chambart commented 8 years ago

A lot of functions are identical, sometimes coming from different instances, or sometimes generated stubs for partial applications.

Sharing equal functions in general is not easy:

For functions using specialised arguments, there is a specific complexity as two functions equal but with different sets of those cannot be shared if we can't prove that their argument is aliased, or useless (no value that can contain a closure have been propagated through it in particular). In the case that the argument is aliased, it must also go to a parent context of both definitions where the value to which the argument is alias is bound.
Proving that two functions are equal is not trivial, even structural equality: there is no good (and simple) binding order in general. for instance if two functions have a non empty closure then the order of the bindings of the elements of the closure is not defined, and trying every order is O(factorial).

In some cases, the code of two functions is equal, without having the same annotations at all. It is probably easier for that kind of cases to do the sharing later (during flambda_to_clambda for instance).

lpw25 commented 8 years ago

I have some thoughts about this. I think that we could separate out the code for functions from the closures (something like a Code construct in named). The construct representing function code would have sets of variables for the free variables, parameters and the function itself. Both the parameters and the free variables would have approximations (or as I insist on thinking of them: types) attached to them. The construct representing closures would have code fields pointing to the codes of the functions in the closure, a map from the free variables of these functions to variables in the current scope, and approximations for the parameters of the functions (at least as informative as those attached to the code, but they could also contain additional information -- this is a more expressive version of specialised_args).

Simplifying a code construct would use the approximations in it to simplify the body. Simplifying a closure construct would be similar to inlining -- it would attempt to simplify the body of the function further using the additional information (i.e. the approximations of the bound variables from the environment and the approximations of the parameters in the closure) if the benefit of this simplification is worth the cost of duplicating the function body then it would create a fresh code construct containing the simplified body and use that instead of the old one.

The sharing of functions would come from inlining and specialisation (the main source of duplication at the moment). Inlining a function that contained a closure construct would create a new closure construct that pointed to the same code as the original closure construct in the function's body.

This would also let us inline more functions since inlining a function not necessarily incur the cost of duplicating the function bodies of its closures.

mshinwell commented 8 years ago

Related to #128

OCamlPro / flambda-task-force

Function sharing #140