Closed lgray closed 10 months ago
@nsmith- please review when you have time, thanks!
And just to be sure, you're fine with the dask.delayed object getting cached? (This is to enforce reuse across multiple calls to wrap the correction; otherwise each call generates a new key in the graph and more payload.)
Mostly just an issue of thread safety, but I don't imagine people using correctionlib in python threads (as opposed to processes) that much.
I'm much more scared of attempting to persist the dask.delayed object in the library code. Just having it wrapped is fine, I think.
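For readers following the caching discussion, here is a minimal standard-library sketch of the idea. It is not the code in this PR: `FakeDelayed` is a hypothetical stand-in for a dask.delayed object (it only mimics "each new wrapper gets a fresh key"), `Correction` is a hypothetical placeholder rather than the correctionlib class, and the lock is one assumed way of addressing the thread-safety concern raised above.

```python
import threading
import uuid


class FakeDelayed:
    """Hypothetical stand-in for a dask.delayed object: a payload plus a unique graph key."""

    def __init__(self, payload):
        self.payload = payload
        # Assumption for illustration: every new wrapper gets a fresh random key.
        self.key = f"delayed-{uuid.uuid4().hex}"


class Correction:
    """Hypothetical correction object (not the correctionlib class)."""

    def __init__(self, name):
        self.name = name
        self._delayed = None
        self._lock = threading.Lock()

    def uncached_wrapper(self):
        # A new wrapper, and therefore a new key, on every call.
        return FakeDelayed(self)

    def cached_wrapper(self):
        # Create the wrapper once and reuse it; the lock guards the
        # check-then-set against concurrent callers in Python threads.
        with self._lock:
            if self._delayed is None:
                self._delayed = FakeDelayed(self)
            return self._delayed


corr = Correction("example")
uncached_keys = {corr.uncached_wrapper().key for _ in range(3)}
cached_keys = {corr.cached_wrapper().key for _ in range(3)}
print(len(uncached_keys), len(cached_keys))  # prints: 3 1
```

With the cached variant, repeated wrapping contributes a single key (and a single copy of the payload) rather than one per call, which is the reuse being asked about.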
Yeah what I've implemented here was more or less what Martin suggested so far as dask usage patterns are concerned. No need to persist if you wrap it in the delayed object. It'll be handled by any scheduler that conforms to the spec. This is also what's being done over in coffea for corrections and ml models after his suggestion.
Also add awkward wrapper to CompoundCorrection.
This PR lets us pass dask_awkward.Array into correctionlib corrections. It does the wrapping of the correction into a delayed object and map_partitions call internally now.
This is significantly cleaner than the map_partitions version.