A very interesting blocked version of the CP decomposition which under the hood uses ALS with fixed block sizes. It is very similar to the paneled ALS method; the difference is the following: after optimizing a block of size r <= R_{target}, the gradient of the target tensor is computed, and the next block of size r' then tries to approximate that gradient. This continues until no blocks remain. If r = R_{target}, i.e. a single block, one recovers the exact CP-ALS decomposition.
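A minimal sketch of this blocked scheme for a dense 3-way tensor, assuming plain numpy and (for the quadratic loss) that the gradient with respect to the current approximation is just the residual; all function and variable names here are my own, not from the actual implementation:

```python
import numpy as np

def khatri_rao(X, Y):
    # Column-wise Kronecker product: row index x*rows(Y) + y.
    r = X.shape[1]
    return (X[:, None, :] * Y[None, :, :]).reshape(-1, r)

def als_block(T, r, n_iter=50, seed=0):
    """Fit a rank-r CP approximation to the 3-way tensor T with plain ALS."""
    I, J, K = T.shape
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((I, r))
    B = rng.standard_normal((J, r))
    C = rng.standard_normal((K, r))
    # Mode unfoldings consistent with C-order reshapes.
    T1 = T.reshape(I, J * K)
    T2 = T.transpose(1, 0, 2).reshape(J, I * K)
    T3 = T.transpose(2, 0, 1).reshape(K, I * J)
    for _ in range(n_iter):
        A = T1 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = T2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = T3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def bcd_cp(T, block_ranks, n_iter=50):
    """Blocked CP: each block fits the current residual (the negative
    gradient direction of the squared error), then is subtracted."""
    residual = T.copy()
    factors = []
    for r in block_ranks:
        A, B, C = als_block(residual, r, n_iter)
        residual = residual - np.einsum('ir,jr,kr->ijk', A, B, C)
        factors.append((A, B, C))
    return factors, residual
```

With block_ranks = [R_{target}] this reduces to ordinary CP-ALS; with several smaller blocks, each ALS solve only ever sees rank r, which is where the arithmetic-intensity argument below comes from.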
It is very interesting for two reasons:
Using BCD allows one to fix the rank of each block, effectively eliminating the target rank from the arithmetic intensity and replacing it with a prefactor of r * (number of blocks).
The overall decomposition accuracy depends on the block size. Choosing a block size that is too small will significantly reduce the accuracy of the decomposition. However, there seems to be a way to "saturate" the block-wise accuracy using a block size r < R_{target}. This implies that some tensors (assuming this for tensors in general may be too strong) are made up of a fixed number of component tensors T_{i,j,k} = A_{i,j,k} + B_{i,j,k} + C_{i,j,k} + \dots which themselves have relatively low CP rank. These component tensors are "discovered" as the gradient of T.
I have done limited testing with this method beyond getting it to work on a small DF Coulomb integral tensor (water in the aug-cc-pVDZ/aug-cc-pVDZ-RI basis) and the random test tensors.