Hierarchical 1D networks

RikVoorhaar commented 3 years ago

I have developed some code for hierarchical networks in quimb where all the tensors except the root are isometries (think MERA's and hierarchical tucker). The idea was to develop methods to truncate hierarchical networks with loops (in particular MERA's). Unfortunately the truncation algorithm doesn't have great performance on practical problems, but a lot of the code might still be useful.

In particular I have implemented:

causal cones and efficient computation of environments
a HOSVD algorithm
truncation (works great in loop-less case)
ALS optimization of tensors in the network
Riemannian optimization of subsets of the network (using pytorch and geoopt)
Orthogonalization for tensors that are almost isometries (works great in loopless setting, not as well in general setting)
Hierarchical tucker format, including good quality truncation, optimized computation of environments, and initialization of the format from dense tensors (I should also add initialization from TT, and the ability to change the tree structure)
specific code for MERA's more general than what is currently implemented in quimb (right now it seems only periodic MERA's on 2^n sites are implemented), also code to convert a hierarchical tucker to MERA (by inserting identity disentanglers)

Do you think this will be useful to add to this repository? If so I would love to contribute. Right now most of the code is specific for numpy and pytorch tensors, but I don't think it will be too much work to generalize this.

jcmgray commented 3 years ago

Hi @RikVoorhaar,

That sounds like a lot of cool stuff! In terms of inclusion in quimb I guess the main thing to think about is what a high-level interface or usage of each would look like? I.e. what stuff can be modularized / generalized or should everything be encapsulated in some e.g. HierarchicalNetwork class.

So a hierarchical network is like a tree tensor network, but with isometric constraints on the tensors? Since my crude understanding is that TTNs are already easy to contract and the isometric constraint is quite limiting in the many-body quantum setting, out of interest what's the motivation for these?

A few questions or notes (e.g. some things that might not be documented in quimb yet):

For any tree tensor networks, you can use .canonize_around and .compress_between for optimal truncation, but similarly to MPS I'm guessing there is some nicer way to sweep around everything at once? Or does the unitarity change things as well?
tensors can have a .left_inds attribute after which the .unitize method can be called to project them into a isometry. Since this is compatible with the autodiff qtn.TNOptimizer stuff it is already possible to do global optimization of tensor networks with isometric constraints in quimb - not sure if you have tried this approach yet
quimb uses autoray to generalize to backend array libraries
Having some tensor decompositions (be it HOSVD or tucker / hierarchical tucker might be cool to have), but possibly delegating to a more specialized package like tensorly is the best approach here
Yes, the MERA code is currently pretty basic and mostly just as an illustration of building more complicated networks, in principle its set-up to do e.g. global autodiff optimization using causal cones but I haven't investigated this much
A generic method to compute all exact environments at once and do ALS optimization might be interesting (my hunch is this likely worse than global autodiff optimisation but it might have lower memory requirements).

RikVoorhaar commented 3 years ago

My idea was to write a HierarchicalNetwork class implementing all the methods that work in that generality, and then perhaps another class HierarchicalTucker for methods specific to HT, i.e. recreate some of the stuff from the htucker matlab library. There is no good Python implementation of Hierarchical Tucker format to my knowledge, and after trying a few frameworks quimb seemed like the nicest to work with.

The motivation for all of this was to see if we can use MERAs for low-rank tensor decompositions of e.g. solutions of PDE's or grid-sampled high-dimensional functions. The TT and HT formats have been used for this purpose with success, and the hope was that adding loops to the network would lower rank even further for some problems. Most of the methods I developed for this purpose work in a bit more generality than just MERAs, and the notion of a hierarchical tensor network with isometry constraints appears naturally.

For any tree tensor networks, you can use .canonize_around and .compress_between for optimal truncation, but similarly to MPS I'm guessing there is some nicer way to sweep around everything at once? Or does the unitarity change things as well?

Forcing all the tensors to be isometries makes computing environments much faster. The environment only depends on the causal cone of the tensor, which tends to grow in size as a constant times the depth in the network. Furthermore truncation becomes simpler, since optimal rank-r truncation of a tensor T is simply given by U[:,:r]V[:r,:] if USV is the SVD of the environment of T. For hierarchical networks there is a straightforward leaf-to-root sweeping that requires us to just compute all the environments once to do truncation.

tensors can have a .left_inds attribute after which the .unitize method can be called to project them into a isometry. Since this is compatible with the autodiff qtn.TNOptimizer stuff it is already possible to do global optimization of tensor networks with isometric constraints in quimb - not sure if you have tried this approach yet

The idea is to do optimization with a network that is already isometric, but indeed something like '.unitize' is what I was using.

quimb uses autoray to generalize to backend array libraries

Yes! I was going to try to rewrite my code to use autoray. The API seems good, so this shouldn't be too much work.

Having some tensor decompositions (be it HOSVD or tucker / hierarchical tucker might be cool to have), but possibly delegating to a more specialized package like tensorly is the best approach here

Implementing hierarchical tucker for tensorly also sounds like a good idea. I'm not very familiar with that toolbox, but I can ask if there is interest in this.

Yes, the MERA code is currently pretty basic and mostly just as an illustration of building more complicated networks, in principle its set-up to do e.g. global autodiff optimization using causal cones but I haven't investigated this much

Right, I think my code is able to generalize this a bit further. It might be nice for researchers using MERAs, since it should be less work to implement variations on the MERA format implemented in quimb.

A generic method to compute all exact environments at once and do ALS optimization might be interesting (my hunch is this likely worse than global autodiff optimisation but it might have lower memory requirements).

I haven't played much with global autodiff since it seems that for larger networks this is prohibitively expensive, and in many problems ALS has good performance.

jcmgray commented 3 years ago

Forcing all the tensors to be isometries makes computing environments much faster. The environment only depends on the causal cone of the tensor, which tends to grow in size as a constant times the depth in the network. Furthermore truncation becomes simpler, since optimal rank-r truncation of a tensor T is simply given by U[:,:r]V[:r,:] if USV is the SVD of the environment of T. For hierarchical networks there is a straightforward leaf-to-root sweeping that requires us to just compute all the environments once to do truncation.

I was thinking more that, for any TTN, you can shift the orthogonality center around, just like TT/MPS, so that the environment is always exactly the identity and truncation is optimal, even without any isometric constraints - I can imagine with isometric constraints its even more efficient though.

Right, I think my code is able to generalize this a bit further. It might be nice for researchers using MERAs, since it should be less work to implement variations on the MERA format implemented in quimb.

Absolutely.

Well a contribution would be welcome! Let me know if you need any pointers or clarifications, or feel free to open a 'work in progress' style PR and I can try and give some early feedback. I'm currently just in the process of finalising a docs page describing in detail the internal design of quimb, which might be useful so I'll drop a link to it here once pushed.

jcmgray commented 3 years ago

Here are those docs -

https://quimb.readthedocs.io/en/latest/tensor-design.html

RikVoorhaar commented 3 years ago

I would still love to work on this, but I don't think I can find the time for it any time soon. I suggest to close this issue, and if I start working on this project again we can reopen (or if someone else would like to do something along these lines).

jcmgray commented 3 years ago

Sure thing, I am happy to keep it open for visibility.

jcmgray / quimb

Hierarchical 1D networks #69