Accessing arrays with contiguous storage incurs a significant cost compared with sparse storage.
For example, when using contiguous storage for both the inputs and the output of the (a + b + c) / 3 expression, with compression disabled (to factor this parameter out), we get the following performance:
whereas when using sparse storage, we get this:
This is very close to the Numba performance and hence to the maximum achievable performance.
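For reference, a Numba baseline for this expression can be written as a simple parallel kernel. The sketch below is illustrative only; the kernel name, array size, and 1-D layout are assumptions, not the exact benchmark harness used here:

```python
import numba as nb
import numpy as np

@nb.njit(parallel=True)
def mean3(a, b, c, out):
    # Element-wise (a + b + c) / 3, parallelized across elements
    for i in nb.prange(a.size):
        out[i] = (a[i] + b[i] + c[i]) / 3.0

N = 10_000_000  # illustrative size
a = np.linspace(0.0, 1.0, N)
b = np.linspace(1.0, 2.0, N)
c = np.linspace(2.0, 3.0, N)
out = np.empty(N)

mean3(a, b, c, out)  # note: the first call includes JIT compilation time
```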
This supports using sparse storage as the default for in-memory arrays, while contiguous storage is used for on-disk arrays. However, it is worth exploring what causes the large slowdown when using contiguous storage in memory. My current guess is that the index of offsets is compressed by default, and decompressing it adds overhead; but this should be verified with more systematic benchmarking and profiling.
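A minimal sketch of such a benchmark, assuming the python-blosc2 API where array constructors accept a `contiguous` storage flag, `cparams={"clevel": 0}` disables compression, and lazy expressions expose a `compute()` method (`eval()` in older releases); the array size and timing harness are illustrative:

```python
import time
import numpy as np
import blosc2

N = 10_000_000           # illustrative size
cparams = {"clevel": 0}  # compression disabled, to factor this parameter out

def make_operands(contiguous):
    # Build the three in-memory operands with the requested storage layout:
    # contiguous=True stores each array as a single serialized frame,
    # contiguous=False (sparse) keeps chunks as separate in-memory buffers.
    a = blosc2.asarray(np.linspace(0.0, 1.0, N), cparams=cparams, contiguous=contiguous)
    b = blosc2.asarray(np.linspace(1.0, 2.0, N), cparams=cparams, contiguous=contiguous)
    c = blosc2.asarray(np.linspace(2.0, 3.0, N), cparams=cparams, contiguous=contiguous)
    return a, b, c

for contiguous in (True, False):
    a, b, c = make_operands(contiguous)
    expr = (a + b + c) / 3   # lazy expression: nothing is computed yet
    t0 = time.perf_counter()
    # kwargs are assumed to be forwarded to the output array constructor
    out = expr.compute(cparams=cparams, contiguous=contiguous)
    t1 = time.perf_counter()
    print(f"contiguous={contiguous}: {t1 - t0:.3f} s")
```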