Open SteveBronder opened 2 years ago
This seems reasonable. Part 3 shouldn't be too bad since we already have overloading code generation working. Something like #1233 would make it even easier, I think.
Since we have the inliner working, could we cut down on the amount of work the optimizer needs to do if we only run step 1 for functions which are used in the higher order functions?
Is your feature request related to a problem? Please describe.
It would be nice to extend the Struct of Arrays (`SoA`) framework to support UDFs so that they can be used in reduce_sum and other higher order functions. Right now, if a user calls a higher order function, we have to demote every matrix / vector passed to that function to Array of Structs (`AoS`). This is unfortunate since `reduce_sum` is very powerful for large independent blocks of data and parameters.

Describe the solution you'd like
I think we can do this by the following:

1. In the `SoA` optimization pass, when the optimization hits a UDF or a higher order function, it starts a sub-call of the `SoA` optimization for the UDF. It will just return the list of inputs that cannot be `SoA` and then continue the rest of the larger optimization pass.
2. After the `SoA` optimization pass, the program runs another pass over the program collecting which matrices are `SoA`. Then, when it comes to a UDF call in the program, it looks at the memory type of each argument in that call (either `SoA` or `AoS`) and appends that set of argument memory types to a list in the UDF's `meta` record. So now each UDF defined in the functions block knows what combinations of `AoS` and `SoA` arguments it needs to generate.

I think the above will work? It sounds like it's only 3 steps but there's a lot of little things to do in all of those.
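
To make the two passes above a bit more concrete, here is a rough sketch on a toy IR. It is written in Python purely for illustration and is not how stanc3 represents programs. Every name in it (`Call`, `Udf`, `SOA_UNSAFE_OPS`, `demoted_vars`, `record_signatures`, the `index_loop` op) is hypothetical. It also assumes UDFs are not recursive and ignores demotion that would propagate through assignments, which a real pass would have to handle.

```python
from dataclasses import dataclass, field

# Hypothetical: ops assumed to force AoS on their operands.
SOA_UNSAFE_OPS = {"index_loop"}


@dataclass
class Call:
    op: str     # name of a built-in op or of a UDF
    args: list  # names of the variables passed to the call


@dataclass
class Udf:
    params: list                              # parameter names
    body: list                                # list of Call
    meta: list = field(default_factory=list)  # argument memory-type combinations


def demoted_vars(body, udfs):
    """First pass: return the variables used in `body` that cannot be SoA."""
    aos = set()
    for call in body:
        if call.op in SOA_UNSAFE_OPS:
            aos.update(call.args)
        elif call.op in udfs:
            udf = udfs[call.op]
            # Sub-call of the same analysis on the UDF body.
            bad = demoted_vars(udf.body, udfs)
            # Map demoted parameters back to the caller's arguments.
            aos.update(a for p, a in zip(udf.params, call.args) if p in bad)
    return aos


def record_signatures(body, udfs, aos):
    """Second pass: store each call site's memory types on the UDF's meta list."""
    for call in body:
        if call.op in udfs:
            sig = tuple("AoS" if a in aos else "SoA" for a in call.args)
            if sig not in udfs[call.op].meta:
                udfs[call.op].meta.append(sig)


# Toy program: the UDF loops over `x`, so `x` must be AoS; `theta` may stay SoA.
udfs = {
    "partial_sum": Udf(
        params=["x", "theta"],
        body=[Call("index_loop", ["x"]), Call("dot", ["theta", "theta"])],
    )
}
main = [
    Call("partial_sum", ["X", "beta"]),
    Call("partial_sum", ["Y", "gamma"]),
    Call("index_loop", ["gamma"]),
]

aos = demoted_vars(main, udfs)
record_signatures(main, udfs, aos)
print(sorted(aos))
print(udfs["partial_sum"].meta)
```

In this toy program `partial_sum` ends up needing two variants, one taking (`AoS`, `SoA`) and one taking (`AoS`, `AoS`), which is the information the later overload generation would consume.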