[FR] Allow SoA for UDFs

Is your feature request related to a problem? Please describe.

It would be nice to extend the Struct of Arrays (SoA) framework to support user-defined functions (UDFs) so that SoA types can be used in reduce_sum and other higher-order functions.

Right now, if a user calls a higher-order function, we have to demote every matrix / vector passed to that function to Array of Structs (AoS). This is unfortunate, since reduce_sum is very powerful for large independent blocks of data and parameters.

Describe the solution you'd like

I think we can do this with the following steps:

1. During the SoA optimization pass, when the optimization hits a UDF or a higher-order function, it starts a sub-call of the SoA optimization for the UDF. The sub-call just returns the list of inputs that cannot be SoA, and then the larger optimization pass continues.
2. At the end of the SoA optimization pass, the compiler runs another pass over the program, collecting which matrices are SoA. When it comes to a UDF call, it looks at the memory type (either SoA or AoS) of each argument at that call and appends that set of argument memory types to a list in the UDF's meta record. Each UDF defined in the functions block then knows which combinations of AoS and SoA arguments it needs to generate.
3. When the compiler starts printing out the C++, it goes through each UDF's list of memory patterns and generates a signature and body for each.
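The three steps above can be sketched as a toy model. This is only an illustration in Python, not stanc3's actual OCaml code: `Udf`, `aos_only`, `demoted_args`, `record_patterns`, `emit_signatures`, and `optimize` are all hypothetical names, `aos_only` stands in for whatever the sub-call analysis would compute, and the emitted strings are pseudo-signatures rather than real Stan Math C++ types. It also assumes a single sweep is enough, whereas a real pass would likely need to iterate to a fixed point.

```python
from dataclasses import dataclass, field

SOA, AOS = "SoA", "AoS"


@dataclass
class Udf:
    name: str
    params: list
    # Hypothetical stand-in for the result of the sub-call analysis:
    # parameters whose use inside the body rules out SoA.
    aos_only: set
    # The "meta record": each distinct tuple of argument memory types seen.
    patterns: list = field(default_factory=list)


def demoted_args(udf, call_args):
    """Step 1: caller variables bound to parameters that cannot be SoA."""
    return {arg for param, arg in zip(udf.params, call_args)
            if param in udf.aos_only}


def record_patterns(udfs, calls, soa_vars):
    """Step 2: record the memory-type pattern of every UDF call."""
    for fname, args in calls:
        pattern = tuple(SOA if a in soa_vars else AOS for a in args)
        if pattern not in udfs[fname].patterns:
            udfs[fname].patterns.append(pattern)


def emit_signatures(udf):
    """Step 3: one pseudo-signature per recorded pattern."""
    sigs = []
    for pattern in udf.patterns:
        typed = [mem + " " + p for mem, p in zip(pattern, udf.params)]
        sigs.append(udf.name + "(" + ", ".join(typed) + ")")
    return sigs


def optimize(udfs, calls, candidates):
    """Run steps 1 and 2; return the variables that stay SoA."""
    soa_vars = set(candidates)
    for fname, args in calls:
        soa_vars -= demoted_args(udfs[fname], args)
    record_patterns(udfs, calls, soa_vars)
    return soa_vars


# Toy program: f is called twice; Y was never an SoA candidate.
udfs = {"f": Udf("f", ["x", "beta"], {"beta"})}
calls = [("f", ["X", "b"]), ("f", ["Y", "b"])]
soa_vars = optimize(udfs, calls, {"X", "b"})
signatures = emit_signatures(udfs["f"])
```

In this toy program `b` is demoted because it binds to `beta`, `X` stays SoA, and `f` ends up needing two variants: one taking an SoA `x` and one taking an AoS `x`.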

I think the above will work? It sounds like only three steps, but there are a lot of little things to do in each of them.

stan-dev / stanc3

[FR] Allow SoA for UDFs #1237