scipp / scippnexus

h5py-like utility for NeXus files with seamless scipp integration
https://scipp.github.io/scippnexus/
BSD 3-Clause "New" or "Revised" License

General benchmarks / profiling #102

Closed SimonHeybrock closed 1 year ago

SimonHeybrock commented 1 year ago

No major effort has been put into optimizing the current implementation. It is "naive" in most cases, which may result, e.g., in repeated or redundant calls to h5py.

We should profile ScippNexus for the "typical" cases, i.e., files with hundreds of groups and thousands of datasets. Attention should be paid not just to loading large datasets, but first and foremost to the "overhead" of dealing with many small items in a file.
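As a starting point for such profiling, here is a minimal sketch using only the standard library. The helper name `profile_call` is hypothetical, and `func` stands for whatever callable loads a file (for example a small wrapper around opening a file with ScippNexus and reading its contents, not shown here).

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func(*args, **kwargs) under cProfile.

    Returns the function's result together with a text report listing the
    `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()
```

Sorting by cumulative time should make it easy to see whether the time is dominated by a few large reads or by many small h5py calls.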

SimonHeybrock commented 1 year ago

Some h5py performance metrics. The numbers are very approximate; they are only meant to show the order of magnitude:

| Op | Time [μs] | Comment |
|---|---|---|
| open dataset | 1000 | |
| get (nested) group or dataset | 300 | why sometimes 1000 for datasets? |
| ds.shape | 120 | |
| ds.shape (cached) | 17 | |
| ds.attrs | 100 | |
| dict(ds.attrs) | 250 | len=0 |
| dict(ds.attrs) | 1000 | len=1, units='m' |
| ds.attrs.get('units') | 500 | |
| dict(ds.attrs) if ds.attrs else dict() | 100 | |
| key in group | 120 | dict is 10x faster |
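Numbers like those in the table above can be gathered with a small micro-benchmark. The following is a sketch, not the script that produced the table: the helper names `time_us` and `benchmark_h5py_ops`, the in-memory file, and the group and dataset names are all assumptions made for illustration.

```python
import timeit


def time_us(op, number=1000):
    """Average time of one call to `op` in microseconds."""
    return timeit.timeit(op, number=number) / number * 1e6


def benchmark_h5py_ops(number=1000):
    """Time a few h5py operations on a tiny in-memory file (assumed layout)."""
    import h5py  # imported lazily so that time_us works without h5py

    # driver="core" with backing_store=False keeps the file in memory only.
    with h5py.File("bench.h5", "w", driver="core", backing_store=False) as f:
        group = f.create_group("entry/instrument")
        ds = group.create_dataset("x", data=[1.0, 2.0, 3.0])
        ds.attrs["units"] = "m"
        keys = dict.fromkeys(group)  # plain-dict snapshot of the group's keys

        ops = {
            "get nested dataset": lambda: f["entry/instrument/x"],
            "ds.shape": lambda: ds.shape,
            "dict(ds.attrs)": lambda: dict(ds.attrs),
            "ds.attrs.get('units')": lambda: ds.attrs.get("units"),
            "key in group": lambda: "x" in group,
            "key in dict": lambda: "x" in keys,
        }
        return {name: time_us(op, number) for name, op in ops.items()}
```

Calling `benchmark_h5py_ops()` returns a dict mapping each operation to its average time in μs. An in-memory file excludes disk and network latency, so results for real NeXus files may differ.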

In typical NeXus files we have:

Note, however, https://arxiv.org/pdf/2112.00228.pdf, which considers cases with more than 100000 entries. Maybe a typical case would have O(10000) entries in practice?

Putting the pieces together, we may thus expect an "overhead" (not counting actually reading the datasets) of a couple of milliseconds per item, which would amount to a couple of seconds or up to a minute in total, assuming one access to each group/dataset/attr.
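The estimate can be made explicit with a back-of-the-envelope calculation. The per-item mix of operations below (one lookup, one shape query, one attribute read) is an assumption chosen for illustration; the costs are the approximate values from the table, and the item counts are placeholders.

```python
# Approximate per-operation costs in microseconds, taken from the table.
# Which operations occur once per item is an assumption.
PER_ITEM_COST_US = {
    "get (nested) group or dataset": 300,
    "ds.shape": 120,
    "dict(ds.attrs), len=1": 1000,
}


def estimate_overhead_seconds(n_items, costs_us=PER_ITEM_COST_US):
    """Total overhead in seconds for n_items, excluding actual data reads."""
    per_item_us = sum(costs_us.values())
    return n_items * per_item_us / 1e6


for n_items in (1_000, 10_000):
    print(f"{n_items} items: ~{estimate_overhead_seconds(n_items):.1f} s")
```

With this mix the per-item cost is about 1.4 ms, giving roughly 1.4 s for 1000 items and 14.2 s for 10000 items, in line with the range stated above.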

SimonHeybrock commented 1 year ago

Results after the recent "rewrite":

# file, scippnexus, scippnexus.v2, speedup
2023/DREAM_baseline_all_dets.nxs         1.99   2.02  1x
2023/BIFROST_873855_00000015.hdf         5.95   0.74  8x
2023/DREAM_mccode.h5                     2.79   0.77  4x
2023/LOKI_mcstas_nexus_geometry.nxs      0.17   0.05  4x
2023/NMX_2e11-rechunk.h5                 7.52   2.05  4x
2023/YMIR_038243_00010244.hdf            3.12   0.11  28x

Note that v2 has seen some behavior changes compared to the current version, so there is no 100% equivalence.
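For reference, the speedup column is simply the ratio of the two timings. A small sketch that recomputes it from the numbers listed above (the dictionary layout and function name are illustrative):

```python
# (scippnexus, scippnexus.v2) timings as listed in the results above.
TIMINGS = {
    "2023/DREAM_baseline_all_dets.nxs": (1.99, 2.02),
    "2023/BIFROST_873855_00000015.hdf": (5.95, 0.74),
    "2023/DREAM_mccode.h5": (2.79, 0.77),
    "2023/LOKI_mcstas_nexus_geometry.nxs": (0.17, 0.05),
    "2023/NMX_2e11-rechunk.h5": (7.52, 2.05),
    "2023/YMIR_038243_00010244.hdf": (3.12, 0.11),
}


def speedup(old, new):
    """Ratio of old to new timing; values above 1 mean v2 is faster."""
    return old / new


for name, (old, new) in TIMINGS.items():
    print(f"{name:40s} {speedup(old, new):5.1f}x")
```

Because the timings are printed with two decimals, recomputing from them may not reproduce every listed factor exactly (the LOKI row gives about 3.4).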