momentoscope / hextof-processor

Code for preprocessing data from the HEXTOF instrument at FLASH, DESY in Hamburg (DE)
https://hextof-processor.readthedocs.io/en/latest/
GNU General Public License v3.0

Binning performance #75

Closed: steinnymir closed this issue 3 years ago

steinnymir commented 3 years ago

In the discussion of issue #73 I showed the binning performance as a function of the number of cores used, and it scaled poorly as more cores were added. That result turns out to be misleading, however, because it was measured on dataframes with too few electrons.

With larger datasets, on the order of 200M electrons, the binning performance improves greatly with the number of cores, up to at least 40 cores. In this case the speedup between 4 and 32 cores was at least a factor of 4. 94 cores seemed slightly faster still, but I did not study this quantitatively.

This should be investigated further to find a sweet spot for the number of cores. Alternatively, a variable number of cores chosen from the size of the dataset could be a cool option.
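A sweet-spot search like this could be scripted roughly as below. This is only a sketch: `benchmark_cores`, `speedup_and_efficiency` and the `bin_func` callable are hypothetical names, not part of hextof-processor, and `bin_func(n_cores)` is assumed to run one full binning pass with that many cores.

```python
import time


def benchmark_cores(bin_func, core_counts, repeats=3):
    """Return {n_cores: best wall-clock time in seconds} for bin_func(n_cores).

    bin_func is a hypothetical callable that runs one binning pass
    using the given number of cores.
    """
    timings = {}
    for n_cores in core_counts:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            bin_func(n_cores)
            # Keep the fastest run to reduce noise from other processes.
            best = min(best, time.perf_counter() - start)
        timings[n_cores] = best
    return timings


def speedup_and_efficiency(timings):
    """Return {n_cores: (speedup, efficiency)} relative to the smallest core count."""
    base = min(timings)
    result = {}
    for n_cores, seconds in timings.items():
        speedup = timings[base] / seconds
        # Efficiency 1.0 means perfect scaling from the baseline core count.
        efficiency = speedup * base / n_cores
        result[n_cores] = (speedup, efficiency)
    return result
```

As a sanity check on the numbers above: a factor 4 speedup going from 4 to 32 cores corresponds to a parallel efficiency of 0.5 relative to the 4-core run, so the sweet spot depends on whether wall-clock time or core-hours matters more.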

zain-sohail commented 3 years ago

I think an adaptive number of cores makes the most sense in this case. Your testing approach can be repeated for a few datasets of different sizes, and we can then interpolate num_cores from the dataset size.
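Something along these lines might work for the interpolation. It is a minimal sketch, assuming a small calibration table of (dataset size, best core count) pairs obtained from benchmarks. The values in `CALIBRATION` are placeholders, not measurements, and `pick_num_cores` is a hypothetical helper, not an existing function in the package.

```python
import math
import os

import numpy as np

# Hypothetical calibration table: (number of electrons, best core count).
# Placeholder values only; replace with real benchmark results.
CALIBRATION = [(1e6, 4), (1e7, 8), (1e8, 16), (2e8, 32)]


def pick_num_cores(n_electrons, calibration=CALIBRATION, max_cores=None):
    """Interpolate a core count for a dataset with n_electrons rows."""
    if max_cores is None:
        max_cores = os.cpu_count() or 1
    sizes, cores = zip(*sorted(calibration))
    # Interpolate linearly in log10(size), since dataset sizes span orders
    # of magnitude. np.interp clamps to the end values outside the table.
    log_size = math.log10(max(n_electrons, 1))
    estimate = np.interp(log_size, np.log10(sizes), cores)
    # Never return fewer than 1 core or more than the machine offers.
    return int(min(max(1, round(float(estimate))), max_cores))
```

Interpolating in log space is a design choice rather than a requirement. Datasets outside the calibrated range fall back to the nearest calibrated core count, so the table should cover the sizes we actually expect.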

zain-sohail commented 3 years ago

With the current data structure, the binning performance has vastly improved, so I don't think we need to improve it further. If you think otherwise, please reopen this issue.