Closed chaserileyroberts closed 4 years ago
you can assign me to this
@Thenerdstation @amilsted @viathor @MichaelMarien I think we should have a serious discussion about how we are planning to implement this. I am working towards implementing DMRG, TDVP and a few other 1d tensor network algorithms into the library. Ideally I would like to build it such that only minimal changes are required once we support symmetric tensors. For this we have to first settle on how we want to support it.
The only sane way of organizing block sparsity is by packing blocks into a class `SparseTensor`. This class also needs to take a backend argument. `SparseTensor` holds an array of tensors of the corresponding backend type. We could probably also repurpose `Node` to fill the role of `SparseTensor`. In any case, `SparseTensor` implements the API that the high-level operations use.
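A minimal sketch of what such a container might look like, assuming blocks are keyed by per-leg charge tuples; the names (`SparseTensor`, `blocks`, `backend`) are illustrative, not an existing API:

```python
import numpy as np

class SparseTensor:
    """Block-sparse tensor: maps charge-sector keys to dense backend arrays."""

    def __init__(self, blocks, backend="numpy"):
        # blocks: dict mapping a tuple of per-leg charges to a dense array
        self.blocks = dict(blocks)
        self.backend = backend

    @property
    def num_blocks(self):
        return len(self.blocks)

# a rank-2 tensor with two charge-conserving (diagonal) blocks
t = SparseTensor({(0, 0): np.eye(2), (1, 1): np.ones((3, 3))})
```

The key design point is that each block is an ordinary dense array of whatever backend is selected, so all heavy numerics stay in the existing backends.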
An immediate issue is that all backend functions (`ones`, `zeros`, `randn`, `tensordot`, ...) are incompatible with `SparseTensor`, meaning that we would have to redesign each backend to handle both sparse and dense tensors of the corresponding type.
I would suggest that in this case we resort to internally supporting only `SparseTensor`, handling dense tensors as special cases of sparse tensors with only one block (these details can be hidden from the user). We then build a linear algebra module which implements backend-independent linalg operations acting exclusively on `SparseTensor`. Behind the scenes we would still use our current backends at the level of the individual dense tensors. This could be hard for e.g. sparse solvers if we want to support TPU with TensorFlow or JAX (due to possible complications under `jax.jit` or `tf.function`).
This approach also has a potential drawback: we likely need to store the dense blocks in a list (occupying different memory locations). For high-rank sparse tensors, the number of such blocks can become huge: e.g. for a rank-7 tensor (very common in MERA) with 10 charge sectors on each leg (a very typical situation in DMRG), this typically yields on the order of 10^7 non-zero blocks. Operations like `transpose`, `tensordot`, `svd`, ... can become very slow because we need to iterate through a very long list in Python. People typically port such things to e.g. C++ to obtain a considerable speed-up. In our case, this may be very hard or impossible to do. In the long run, it would thus make sense to design a symmetric tensor class in C++ and expose its interface to Python.
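A quick back-of-envelope check of the block-count estimate above: the number of charge-sector combinations a pure-Python loop would have to visit grows as sectors**rank, which is where the 10^7 figure comes from.

```python
# rank-7 tensor, 10 charge sectors per leg (typical MERA/DMRG setting)
rank, sectors = 7, 10
combinations = sectors ** rank  # sector combinations to iterate over
print(combinations)  # 10000000
```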
We should also decide whether GPU/TPU support is a priority for `SparseTensor` or not. It is conceivable that sparse problems benefit substantially less from accelerated hardware. For example, in symmetric DMRG one typically ends up with O(10)-O(100) charge sectors, but each dense block has a modest bond dimension of O(100).
Closing this issue because we have added the support.
Block sparsity is needed for many physics applications.