Open CSSFrancis opened 4 months ago
I remember coming across it when you did the work on map and ragged (possibly it was mentioned in our discussion, or it came up in a review of the state of the art), and I was unsure at the time whether using awkward-array was worth it compared with sticking with numpy arrays.
Now that it seems to be used more widely in the scientific community (no longer only in the high energy physics community it was developed for) and integrates with the other usual suspects (dask, cupy, numba, etc.), it may be worth considering this thin wrapper around Awkward Array: the ragged library. See https://github.com/scikit-hep/ragged/discussions/6 for a discussion of the differences between the two.
It would need a champion to push for this! 😃
Describe the functionality you would like to see.
The ragged implementation in hyperspy currently acts as a bit of a bottleneck, and this is largely a result of the object implementation in numpy. For one, it is slow, which slows down vector operations that should be relatively quick! It also doesn't have a good definition for different flavors of ragged arrays. For example:
This makes visualization and axes definition difficult. In case 1 we might want to pass axes with no "size" parameter. In case 2 we might want a size along one dimension but not along the row dimension. In case 3 we just want a "ragged" definition.
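To make the three cases concrete, here is a minimal sketch using plain numpy object arrays. The arrays and the `make_ragged` helper are hypothetical illustrations, not hyperspy code, and they are only one possible reading of cases 1 to 3. The point is that numpy reports the same dtype and shape for all three, so the container cannot tell which inner dimensions have a meaningful size.

```python
import numpy as np


def make_ragged(rows):
    """Pack variable-length rows into a 1-D object array (hypothetical helper)."""
    out = np.empty(len(rows), dtype=object)
    for i, row in enumerate(rows):
        out[i] = np.asarray(row)
    return out


# Assumed case 1: a variable number of scalars per position (no inner size).
scalars = make_ragged([[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]])
# Assumed case 2: a variable number of rows, each a fixed-width vector (n_i x 2).
vectors = make_ragged([np.zeros((2, 2)), np.zeros((5, 2)), np.zeros((1, 2))])
# Assumed case 3: fully ragged, nothing is fixed between positions.
mixed = make_ragged([np.zeros((2, 3)), np.zeros(4), np.zeros((1, 2, 2))])

# All three look identical from the outside: dtype object, shape (3,).
for name, arr in [("scalars", scalars), ("vectors", vectors), ("mixed", mixed)]:
    print(name, arr.dtype, arr.shape, [a.shape for a in arr])

# Reductions have to loop over the elements in Python, which is the slow part.
row_sums = np.array([a.sum() for a in scalars])
```

Any axes or plotting code has to inspect every element to find out which flavor it was given, which is where an explicit type description would help.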
I think the solution would be to adopt awkward-array:
Awkward arrays are fast: https://awkward-array.org/doc/main/getting-started/what-is-an-awkward-array.html#high-performance
Awkward arrays have more defined shapes/structures: https://awkward-array.org/doc/main/getting-started/what-is-an-awkward-array.html#versatile-arrays
Awkward arrays have (some) dask implementations: I don't know if I love this part. The dask implementation might need some additional work.