pmgbergen / porepy

Python Simulation Tool for Fractured and Deformable Porous Media
GNU General Public License v3.0

Improvements needed in the Ad MergedOperator concept #1183

Open keileg opened 1 month ago

keileg commented 1 month ago

This is, for now, a discussion of possible improvements to the concept of MergedOperators within the Ad framework. Hopefully, an implementation plan will emerge.

Background: The role of MergedOperators

The MergedOperator class is a wrapper around spatial discretization matrices defined on (in general) multiple grids. Discretization classes generally produce multiple discretizations, e.g. `Mpfa('foo').flux`, `Mpfa('foo').bound_flux`, etc., and during the Ad wrapping of a discretization, a MergedOperator is produced for each of these discretization fields. Discretization classes with different physics keys (`Mpfa('foo')` and `Mpfa('bar')`) produce distinct sets of MergedOperators. Importantly, a MergedOperator is defined on a set of grids, and thus serves as a bridge between the operation of discretization, which is defined on individual grids, and the mixed-dimensional formulation of equations.
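For concreteness, a minimal sketch of this structure, assuming the `pp.ad.MpfaAd` wrapper takes a keyword and a list of subdomains (the keyword and grid here are illustrative only):

```python
import porepy as pp

# A single Cartesian grid stands in for a general set of subdomains.
g = pp.CartGrid([2, 2])
g.compute_geometry()
subdomains = [g]

# Ad wrappers for the same discretization with two different physics
# keywords; each wrapper exposes one MergedOperator per discretization
# field, and the two sets of operators are distinct.
mpfa_foo = pp.ad.MpfaAd("foo", subdomains)
mpfa_bar = pp.ad.MpfaAd("bar", subdomains)

flux_foo = mpfa_foo.flux              # MergedOperator for the flux field
bound_flux_foo = mpfa_foo.bound_flux  # MergedOperator for boundary fluxes
```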

The MergedOperator has no functionality to discretize, although it holds a reference to its underlying discretization. Parsing a MergedOperator entails a loop over its domain of definition, collecting the discretization matrices (assumed to have been computed) and assembling them into a block diagonal matrix.
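As pseudocode, the parsing step looks roughly like this (a minimal sketch; the function name and nested storage keys are hypothetical stand-ins for the real data structures):

```python
import scipy.sparse as sps

def parse_merged_operator(subdomains, data, keyword, mat_key):
    """Collect precomputed discretization matrices from each subdomain's
    data dictionary and assemble them into one block diagonal matrix.

    The layout (nested dictionaries keyed by a discretization-matrices
    entry, the physics keyword and the matrix name) is illustrative.
    """
    blocks = [
        data[sd]["discretization_matrices"][keyword][mat_key]
        for sd in subdomains
    ]
    # scipy allocates a fresh matrix here, so all entries are effectively
    # copied out of the per-grid storage; cf. the memory concerns below.
    return sps.block_diag(blocks, format="csr")
```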

Shortcomings

EK comments:

  1. The current formulation is problematic in terms of CPU and memory requirements: Right now, the discretization matrices are stored in the data dictionaries of individual grids, and parsing effectively copies the full matrix into a block matrix (this is the way scipy works; future usage of PETSc may allow for more flexibility). There are several opportunities for improvement:

     i) Define a central storage for the matrix instead of distributing it among the grids. This will remove the need for copies of the matrix, but will also make updates of the part of the discretization associated with individual subdomains more complex. It is unclear how this will work in a future setting where we have a PETSc matrix backend, potentially in a distributed memory model.

     ii) For some discretizations (e.g., two-point expressions), the discretization can be stored as an array associated with faces, rather than as a full matrix. This will substantially reduce memory requirements related to storage (ex: for Tpfa, there will be an array of floats with size equal to the number of faces, rather than the same number repeated for the two neighboring cells, together with integer arrays of indices and index pointers). Moving the implementation in this direction may also enable just-in-time discretization, rather than the current ahead-of-time approach; see the sketch after this list. This is however not desirable for all discretization methods (read mpfa/mpsa).

  2. Currently, the discretization classes construct all associated matrices, with no regard to whether these will be needed. This can have a noticeable overhead in both CPU and memory requirements. From an analysis of the set of equations in a model (enabled by #1181), it should be possible to identify which discretization objects are actually needed, and compute only these. This will however partly invert the relation between a MergedOperator and its underlying discretization, and may indicate that the current structure is not ideal.
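On point 1.ii: a minimal sketch of how a face-based array could replace a stored Tpfa flux matrix, with the matrix built just in time from the grid's cell-face incidence (sign conventions and boundary handling are glossed over; the function name is illustrative):

```python
import numpy as np
import scipy.sparse as sps

def tpfa_flux_just_in_time(trans: np.ndarray, cell_faces: sps.spmatrix):
    """Build the Tpfa flux matrix on demand from a per-face array.

    trans: transmissibilities, one float per face (num_faces entries),
        which is all that needs to be stored between discretizations.
    cell_faces: (num_faces x num_cells) incidence matrix with entries
        +-1 encoding face orientation relative to neighboring cells.

    Row f of cell_faces @ p gives the pressure difference over face f,
    so scaling rows by the transmissibilities yields the two-point flux
    matrix, with at most two nonzeros per face.
    """
    return sps.diags(trans) @ cell_faces
```

Stored this way, the per-face float array replaces the data, index and index-pointer arrays of the full sparse matrix, matching the memory estimate in point 1.ii.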

Yuriyzabegaev commented 1 week ago

The proposed improvements would also be a fix for #1084.