Open martindurant opened 2 months ago
If all of the functors are structured, like `map`, `filter`, and `reduce`, then you can do better than ArrayBuilder by making the Numba-compiled function generate an index and then apply that index to the array as a slice.
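A minimal sketch of that index-generating idea for the `filter` case, under some assumptions: the jagged array is represented as flat `content` plus list `offsets` (NumPy arrays standing in for an Awkward list layout), and `keep` and `filter_index` are hypothetical names, not part of Awkward Array or Numba. Numba is treated as optional so the sketch also runs as plain Python.

```python
import numpy as np

try:  # Numba is optional here; without it the loop runs as plain Python.
    from numba import njit
except ImportError:
    def njit(func):
        return func


@njit
def keep(x):
    # Hypothetical predicate standing in for the user's filter.
    return x > 2.0


@njit
def filter_index(offsets, content):
    """Return (index, new_offsets) selecting the elements that pass `keep`."""
    index = np.empty(len(content), dtype=np.int64)
    new_offsets = np.empty(len(offsets), dtype=np.int64)
    new_offsets[0] = 0
    n = 0
    for i in range(len(offsets) - 1):
        for j in range(offsets[i], offsets[i + 1]):
            if keep(content[j]):
                index[n] = j
                n += 1
        new_offsets[i + 1] = n
    return index[:n], new_offsets


# Jagged array [[1, 5, 3], [], [4, 2]] as flat content plus list offsets.
offsets = np.array([0, 3, 3, 5], dtype=np.int64)
content = np.array([1.0, 5.0, 3.0, 4.0, 2.0])

index, new_offsets = filter_index(offsets, content)
filtered = content[index]  # one vectorized gather instead of per-element appends
```

The loop only writes integers into preallocated buffers; the data itself is moved once, by the final indexing step.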
You could also add an `axis` argument to this and have it apply at some depth using ak.transform (having all structure above where it's applied stay the same, but the transformation has to be length-preserving). That would solve a whole class of problems in which someone wants to take apart a structure, change something, and then rebuild everything above the changed part the same way.
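To illustrate why the transformation has to be length-preserving, here is a rough NumPy stand-in. It does not call ak.transform; `transform_leaves` is a hypothetical helper, and the nesting is assumed to be stored as one offsets array per level over flat leaf content.

```python
import numpy as np


def transform_leaves(offsets_per_level, content, func):
    """Apply `func` to the flat leaf content and reuse every level of offsets."""
    new_content = func(content)
    if len(new_content) != len(content):
        raise ValueError("transformation must be length-preserving")
    # The offsets are returned untouched, so the nesting above the leaves is identical.
    return offsets_per_level, new_content


# [[[1, 2], [3]], [[4, 5, 6]]] as two levels of offsets over flat content.
outer = np.array([0, 2, 3])
inner = np.array([0, 2, 3, 6])
content = np.array([1, 2, 3, 4, 5, 6])

levels, doubled = transform_leaves((outer, inner), content, lambda x: x * 2)
```

If `func` changed the number of leaves, the stored offsets would no longer point at the right elements, which is why the sketch rejects that case.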
@jpivarski: the `query` function here works on the play data generated by nested-pandas at 10x the speed of the typical approach we discussed, even with the UnmaskedArray PR.

Generate the play data:
Times:
Note that here we make a masked array, so it has exactly the same structure as the original (swapped) array, but where the filter fails, you get None. Otherwise you would need `ak.count`, which takes about 50ms.

It feels like it should be possible to do this really efficiently with `ArrayBuilder` and `numba`? You would need to have a way to turn the "query" into something you can execute in the loop.
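One possible way to turn a "query" into something executable in the loop, sketched under assumptions: the query is a scalar Python predicate, `compile_query` and `in_range` are hypothetical names, and NumPy arrays stand in for the Awkward layout (flat `content` plus list `offsets`). Instead of ArrayBuilder, the loop fills a validity mask, which gives the masked, structure-preserving result; the per-list counts at the end are a NumPy stand-in for what would otherwise be recomputed when compacting.

```python
import numpy as np

try:  # Numba is optional here; without it the loop runs as plain Python.
    from numba import njit
except ImportError:
    def njit(func):
        return func


def compile_query(predicate):
    """Wrap a scalar predicate in a loop that fills a validity mask (hypothetical helper)."""
    jitted = njit(predicate)

    @njit
    def kernel(content):
        valid = np.empty(len(content), dtype=np.bool_)
        for j in range(len(content)):
            valid[j] = jitted(content[j])
        return valid

    return kernel


def in_range(x):
    # Hypothetical query: keep values strictly between 2 and 5.
    return x > 2.0 and x < 5.0


# Jagged array [[1, 5, 3], [], [4, 2]] as flat content plus list offsets.
offsets = np.array([0, 3, 3, 5], dtype=np.int64)
content = np.array([1.0, 5.0, 3.0, 4.0, 2.0])

valid = compile_query(in_range)(content)
# Same length as `content`, so `offsets` still describes the nesting unchanged.
masked = np.ma.masked_array(content, mask=~valid)
# Per-list counts of surviving elements, needed only if the result is compacted.
running = np.concatenate((np.zeros(1, dtype=np.int64), np.cumsum(valid)))
counts = running[offsets[1:]] - running[offsets[:-1]]
```

Because the mask has one entry per element, nothing above the leaves has to be rebuilt; failed entries simply read back as None.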