lgray closed this 1 year ago
@chrispap95 OK - sorry for the noise here while I beat it into shape. This is all ready for completion now. Let me know if you have any questions!
There are perhaps a few places where it's not a trivial map_partitions but I haven't thought about it too hard.
We probably also need to turn on the thread safety options for fastjet now, since dask's default scheduler is threaded.
@chrispap95 ok - finished this up. It would be super useful to try this out with some user analysis and derive some tests before we merge it.
@andrzejnovak FYI
@lgray I can run the SUEP workflow on it. It is using only inclusive_jets and constituents though. Let me try to set up the environment and I will reply back.
@chrispap95 cool - dask-based NanoEvents in the awkward2_dev branch of coffea should work for PFNano so you may just be able to start from that.
@jmduarte I know you guys are using some of the Lund plane functions and stuff - if you have some time could you give this new software stack a try?
There are some bugs in this that need fixing:
OK - this is going to need some work now that I've got some tests running on a file.
Update: I have a janky fix that seems to work well on files.
@chrispap95 can you make a 2 event sample of that PFNano file you pointed me to so that I can make proper tests? There are a lot of gotchas in dask_awkward related to when you finally start reading from a file (and doing so efficiently).
@jpivarski Do you think this is robust enough? I checked the code and, deeper down, all that is ever asked for is E/px/py/pz, but it all looks sorta sketchy IMO.
uuughhh
Fixes #188
Bindings appear to be pretty straightforward at the end of the day.
The trick we use to make delayed operations work is to construct a new ClusterSequence for each concrete array being processed.
Within DaskAwkwardClusterSequence we dispatch the function to be called and the JetDefinition.
There are a large number of functions to convert, but the majority of them seem pretty straightforward.
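For anyone skimming, the per-partition trick described above can be sketched roughly as follows. This is only an illustration with stand-in classes: `JetDefinition`, `ConcreteClusterSequence`, `LazyClusterSequence` and `compute` here are hypothetical names written for this sketch, not the actual fastjet or dask_awkward API, and the "clustering" is a placeholder pt cut.

```python
# Illustrative stand-ins only: none of these classes are the real
# fastjet or dask_awkward API.
from dataclasses import dataclass


@dataclass(frozen=True)
class JetDefinition:
    """Hypothetical stand-in for a jet definition (algorithm and radius)."""
    algorithm: str
    radius: float


class ConcreteClusterSequence:
    """Hypothetical eager sequence working on one in-memory partition."""

    def __init__(self, particles, jetdef):
        self.particles = particles
        self.jetdef = jetdef

    def inclusive_jets(self, min_pt=0.0):
        # Placeholder for real clustering: keep particles above the threshold.
        return [p for p in self.particles if p["pt"] >= min_pt]


class LazyClusterSequence:
    """Stores the jet definition and dispatches calls per partition."""

    def __init__(self, partitions, jetdef):
        self.partitions = partitions  # list of zero-argument loaders
        self.jetdef = jetdef

    def _dispatch(self, func_name, **kwargs):
        def run_on_partition(load):
            # A new concrete sequence is built for each concrete array.
            seq = ConcreteClusterSequence(load(), self.jetdef)
            return getattr(seq, func_name)(**kwargs)

        # Nothing is evaluated here: one deferred task per partition.
        return [lambda load=load: run_on_partition(load)
                for load in self.partitions]

    def inclusive_jets(self, min_pt=0.0):
        return self._dispatch("inclusive_jets", min_pt=min_pt)


def compute(tasks):
    """Evaluate the deferred tasks, standing in for a scheduler."""
    return [task() for task in tasks]
```

Calling `LazyClusterSequence(...).inclusive_jets(min_pt=10.0)` only records the work; the concrete sequences are built when `compute` runs each task. Since every task owns its own sequence, no sequence is shared between partitions, which matters when the scheduler is threaded.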