Proposal: going to sparse matrix in tomography tools

wave46 commented 1 year ago

Dear Cherab team.

The output matrices in, for example, admt_utils.py are usually very big and sparse and need lots of memory to store. How do you think, if it would be profitable for both memory usage and computation efficiency to go to sparse matrix notation using scipy.sparse?

Thanks, Ivan Kudashev

jacklovell commented 1 year ago

I did consider whether it was worthwhile making these sparse matrices when first implementing those utilities (which were initially copied from demo functionality scripts where optimisation was very much not a consideration). In the end I went with standard array types because it was a) what had already been implemented, b) simpler to debug and c) the inversion routines these matrices were used with (e.g scipy.optimize.nnls and cherab.tools.inversions.invert_regularised_sart) required non-sparse matrices anyway, so the memory savings would have been lost when they actually came to be used.

For the inversion grids of the JET and MAST-U inversions I was using these for the matrices weren't unreasonably large. I can see that if you want to do a high resolution ITER inversion for example then this may become an issue, but at that point O(N**3) inversion routines probably aren't suitable either so the requirement for non-sparse matrices is dropped.

Do the classes in scipy.sparse support the same operations as standard 2D arrays transparently? Would there be any modification needed to the utilities in admt_utils.py other than changing the initialisation of these arrays from np.zeros to a suitable sparse matrix initialiser? If that's the only change we could add a new boolean function argument sparse which would return sparse matrices when True and normal matrices when False, and it would default to False to maintain backwards compatibility. If more significant modifications are required I'd suggest adding new functions which geneate the sparse matrices: users would then call these new functions when they want sparse matrices and the original functions otherwise.

wave46 commented 1 year ago

Thank you, @jacklovell for your answers.

I agree with you, that finally for using inversion tools one needs a dense matrix. Some of the classes in scipy.sparse (e.g. csr_matrix, 'lil_matrix') handle the same notation as normal matrices. However, it seems to me, that just changing initializing without changing the way how the matrices are defined will be even slower, because than each time the sparsity of the matrix is changed. Otherwise, one should first carefully store all of the indices and corresponding non-zero values as arrays and after that define a matrix.

So, summing the fact that one still needs dense matrices for the inversions and it that the initialization of the operators in admt_utils.py should be changed quite a lot, I think that dealing with sparse matrices can be left to user or, indeed, new functions should be made.

cherab / core

Proposal: going to sparse matrix in tomography tools #380