Closed graeme-a-stewart closed 1 year ago
Hi @grasph - I tracked down the bug that was causing your implementation to deviate from the FastJet results. If a tile had no valid jets, its iterator still returned one value (noTiledJet), which sometimes caused a jet with no neighbours to merge with the beam prematurely. The fix costs about 4% in runtime.
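To illustrate the failure mode described above, here is a minimal Python sketch. It is not the code from this PR: the `TiledJet` class, the `NO_TILED_JET` sentinel (a stand-in for noTiledJet) and both generator functions are hypothetical, and the sketch assumes that the jets in a tile form a linked list whose head is the sentinel when the tile is empty.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class TiledJet:
    id: int                            # id 0 marks "no jet" in this sketch
    next: Optional["TiledJet"] = None  # next jet in the same tile


# Hypothetical stand-in for the noTiledJet sentinel.
NO_TILED_JET = TiledJet(id=0)


def is_valid(jet: Optional[TiledJet]) -> bool:
    return jet is not None and jet.id != 0


def buggy_tile_jets(first: TiledJet) -> Iterator[TiledJet]:
    """Yields the head before checking it, so an empty tile leaks the sentinel."""
    jet = first
    while True:
        yield jet
        jet = jet.next
        if not is_valid(jet):
            return


def tile_jets(first: TiledJet) -> Iterator[TiledJet]:
    """Checks validity before yielding, so an empty tile yields nothing."""
    jet = first
    while is_valid(jet):
        yield jet
        jet = jet.next
```

With an empty tile, `buggy_tile_jets(NO_TILED_JET)` yields the sentinel once, while `tile_jets(NO_TILED_JET)` yields nothing. Both behave identically on a tile that holds real jets.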
I have also applied the nice optimisation to the search for the lowest dij value at each iteration, using LoopVectorization. This saves a lot, and the final code is now about x1.45 faster than the FastJet N2Tiled algorithm (173us/event vs. 251us/event for FastJet on my MacBook). I was a little concerned that the final dij picked here might have a race condition, but I ran it hundreds of times and it always gets the correct value.
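On the race-condition worry: a minimum search can be split across independent lanes without shared state, provided each lane keeps its own running best and the lane results are merged in a final step. The Python sketch below shows that pattern and checks it against a plain scalar scan. It is an assumption about the general technique, not a description of what the vectorisation package generates, and the function names and `lanes` parameter are hypothetical. Both functions require a non-empty input.

```python
from typing import List, Tuple


def argmin_scalar(dij: List[float]) -> Tuple[int, float]:
    """Reference scan: first index holding the smallest value."""
    best_i, best = 0, dij[0]
    for i in range(1, len(dij)):
        if dij[i] < best:
            best_i, best = i, dij[i]
    return best_i, best


def argmin_lanes(dij: List[float], lanes: int = 4) -> Tuple[int, float]:
    """Lane-wise scan: each lane owns its own (value, index) pair.

    Lane k visits indices k, k + lanes, k + 2*lanes, ... so no two lanes
    ever write to the same running best.
    """
    partial = []
    for lane in range(min(lanes, len(dij))):
        best_i, best = lane, dij[lane]
        for i in range(lane + lanes, len(dij), lanes):
            if dij[i] < best:
                best_i, best = i, dij[i]
        partial.append((best, best_i))
    # Final merge: tuples compare by value first, then by index, so ties
    # resolve to the lowest index, matching the scalar scan.
    best, best_i = min(partial)
    return best_i, best
```

For example, `argmin_lanes([3.0, 1.0, 2.0, 1.0])` returns `(1, 1.0)`, the same as the scalar scan, because the tie between indices 1 and 3 is broken towards the lower index.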
I'm also running some more benchmarks on x86 boxes. The speed-up is not quite as high there (tested on one Intel and one AMD machine), with a 16% gain. That still allows us to beat FastJet, but only by a small margin, x1.06.
Excellent!
Fix a bug in the tile iterator, so that if a tile has no jets (i.e., the first "jet" is not valid) then the iterator returns nothing
Speed up the search for the best dij value by using the LoopVectorization package
Some small fixes for comments and additions to .gitignore