Closed IdRatherBeCoding closed 1 year ago
I am also interested in this. In fact, there is a related discussion here: https://github.com/facebookresearch/theseus/discussions/609
The LU solver at least is rather simple to use and can be called with something along the lines of the following snippet (simple adaptation needed for batched input):
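# Assumed import path for the Theseus cuSOLVER extension; adjust to your install.
from theseus.extlib.cusolver_lu_solver import CusolverLUSolver

def test_solve_lu(A, b):
    # A: sparse CSR tensor, b: dense right-hand side vector
    batch_size = 1
    solver = CusolverLUSolver(batch_size, A.shape[1], A.crow_indices(), A.col_indices())
    # factor() takes the CSR values with a leading batch dimension
    singularities = solver.factor(A.values().unsqueeze(0))
    print("singularities:", singularities)
    x = b.clone().unsqueeze(0)
    solver.solve(x)  # x is passed in as the right-hand side and returned as the solution
    return x.squeeze()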
If the backward function is required as well, I am not sure this can be done directly with Theseus, as the higher-level wrapper LUCudaSolveFunction actually solves the normal equations. You could, however, use CusolverLUSolver in combination with the approach we use here:
https://github.com/cai4cai/torchsparsegradutils/blob/main/torchsparsegradutils/sparse_solve.py#L223
Putting this together is something I want to look into at some point: https://github.com/cai4cai/torchsparsegradutils/issues/49
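For readers unfamiliar with the linked approach: the backward pass of a linear solve can be obtained from a second solve with the transposed matrix. Below is a minimal NumPy sketch under that assumption, with dense arrays standing in for the sparse solver. The helper names solve_forward and solve_backward are hypothetical, and this is not the Theseus or torchsparsegradutils implementation.

```python
import numpy as np

def solve_forward(A, b):
    # Forward pass: x = A^{-1} b (dense stand-in for a sparse LU solve)
    return np.linalg.solve(A, b)

def solve_backward(A, x, grad_x):
    # Adjoint solve: dL/db = A^{-T} dL/dx
    grad_b = np.linalg.solve(A.T, grad_x)
    # dL/dA[i, j] = -grad_b[i] * x[j], evaluated only on the sparsity pattern
    rows, cols = np.nonzero(A)
    grad_A_values = -grad_b[rows] * x[cols]
    return grad_b, rows, cols, grad_A_values

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

x = solve_forward(A, b)
grad_x = np.ones_like(x)  # gradient of the toy loss L = sum(x)
grad_b, rows, cols, grad_A_values = solve_backward(A, x, grad_x)
```

With CusolverLUSolver, the two np.linalg.solve calls would be replaced by factor/solve calls; whether the solver exposes a transposed solve is not covered in this thread.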
@tvercaut Thanks for your answer - we will try this too!
Hi @IdRatherBeCoding and @tvercaut. @maurimo (the main author of BaSpaCho) wrote a similar example on using BaSpaCho in 2c6a8ce3daf0a255f639630018fe17478c00c7aa. Hopefully you can find this useful as well. Let me know if you have any questions. (PS: @tvercaut sorry for the delay, he was only able to find time for this over the past weekend.)
Given the interest, a clean API exposing differentiable versions of these solvers will likely be the next major feature we add. It is mostly a matter of finding the time, as other projects are consuming most of our bandwidth.
The support is appreciated, thank you. For our use case the example code with 1x1 blocks has impressive scaling with batch size, but the overhead of setup and relatively slow solve is too high for us. I have tried larger (uniform size) blocks, but this doesn't change much. Do you happen to have any advice on approaches to block sizing for general matrices? (e.g. Laplacians).
Otherwise we look forward to some of the efficiency improvements mentioned in the baspacho README :)
@IdRatherBeCoding Unfortunately, @maurimo himself pointed out the slow setup when using the default 1x1 blocks. BaSpaCho currently has no automatic way to determine a block structure, so it is still up to the user to set up an appropriate one.
@maurimo do you perhaps have any advice on block sizing for general matrices?
Hello, sorry for the late reply! BaSpaCho is built around the matrix being block-structured. Of course this is a double-edged sword, as it will be (a bit...) counterproductive if the matrix doesn't have a block structure. But in general (and especially when doing optimization) matrices actually are block-structured, with blocks corresponding to variable pairs, and variables often being 3D or at least 2D vectors.
I also have to say that I made sure that factor was as optimized as possible (in my use case it's invariably the bottleneck), and I have paid little attention to solve and even less to the setup, contributors welcome :-p. Also, for Theseus the setup is done only once, so this is a win more often than not. I hope I will eventually find some time to work on improving those ops too.
That said, if your matrix is a Laplacian, don't you have at least 2D variables, making the blocks 2x2? If the matrix is really not block-structured, there isn't much you can do other than using 1x1 blocks (or using zero fill-in, but this is likely to make things worse).
If you are not building the matrix yourself, you can "discover" a block structure using a hashing trick: you build a list of random "Zobrist" keys Z_j (as many as the order N of the matrix), and for each i you compute H_i as the sum of Z_j over all j such that M_{i,j} or M_{j,i} is nonzero. Then if H_i = H_k, it means that those columns/rows actually belong to the same block, and you can apply a permutation to make them consecutive and define a non-trivial block structure. Sorry if this is not very clear; let me know if this trick might be useful to you and I will provide proof-of-concept code!
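The hashing trick described above can be sketched in a few lines of plain Python. This is only an illustration of the idea as stated in the post, not code from BaSpaCho: the function name discover_blocks and the (i, j) entry-list input format are assumptions made for the example.

```python
import random
from collections import defaultdict

def discover_blocks(n, entries, seed=0):
    """Group indices whose symmetrized sparsity patterns coincide.

    entries: iterable of (i, j) pairs where M[i, j] is structurally nonzero.
    Returns (perm, sizes): a permutation making each group consecutive,
    and the resulting block sizes.
    """
    rng = random.Random(seed)
    keys = [rng.getrandbits(64) for _ in range(n)]  # one Zobrist key per index

    # Symmetrize the pattern so each neighbor contributes exactly once.
    neighbors = [set() for _ in range(n)]
    for i, j in entries:
        neighbors[i].add(j)
        neighbors[j].add(i)

    # H_i = sum of Z_j over all j coupled to i (Python ints do not overflow).
    hashes = [sum(keys[j] for j in neighbors[i]) for i in range(n)]

    # Equal hashes suggest equal patterns; a careful implementation would
    # confirm by comparing the neighbor sets to rule out hash collisions.
    groups = defaultdict(list)
    for i, h in enumerate(hashes):
        groups[h].append(i)

    blocks = sorted(groups.values(), key=lambda g: g[0])
    perm = [i for block in blocks for i in block]
    sizes = [len(block) for block in blocks]
    return perm, sizes

# Three 2D variables whose components are interleaved: {0,3}, {1,4}, {2,5}.
true_blocks = [(0, 3), (1, 4), (2, 5)]
coupled = [(0, 0), (1, 1), (2, 2), (0, 1), (1, 2)]
entries = [(i, j) for a, b in coupled
           for i in true_blocks[a] for j in true_blocks[b]]

perm, sizes = discover_blocks(6, entries)
print(perm, sizes)  # [0, 3, 1, 4, 2, 5] [2, 2, 2]
```

Note that when no two rows share a sparsity pattern, as is typically the case for a scalar grid Laplacian, every group has size 1 and the trick finds nothing to merge.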
Thanks for this input, @maurimo. I tried your suggestion to identify blocks, but it did not come up with anything (maybe I misunderstood something). Instead, I tried converting the matrix to a banded structure and was able to get some improvement using blocksize > 1 for the banded matrix.
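The post does not say how the banded structure was obtained. One common option is a reverse Cuthill-McKee reordering, sketched here with SciPy on a small path-graph Laplacian whose vertices are deliberately numbered out of order. The bandwidth helper and the test matrix are made up for the illustration.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee

def bandwidth(A):
    # Largest distance of a stored entry from the diagonal
    coo = A.tocoo()
    return int(np.abs(coo.row - coo.col).max())

# Path graph visiting the vertices in a scrambled order
n = 8
path = [0, 4, 1, 5, 2, 6, 3, 7]
adj = sp.coo_matrix((np.ones(n - 1), (path[:-1], path[1:])), shape=(n, n))
adj = adj + adj.T

# Graph Laplacian: degree on the diagonal, -1 for each edge
degrees = np.asarray(adj.sum(axis=1)).ravel()
L = (sp.diags(degrees) - adj).tocsr()

# Symmetric permutation that tends to cluster entries near the diagonal
perm = reverse_cuthill_mckee(L, symmetric_mode=True)
L_banded = L[perm][:, perm]

print("bandwidth before:", bandwidth(L))
print("bandwidth after: ", bandwidth(L_banded))
```

Once the entries sit close to the diagonal, a uniform block size larger than 1 covers them with relatively little zero fill-in, which may explain the improvement reported above.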
Hello.
I am interested in using your PyTorch wrapper of baspacho for solving multiple Ax = b problems (as opposed to least squares). I can see tests using SymbolicDecomposition, but I don't understand how to prepare the arguments. It would be really appreciated if you could provide an example of how to use SymbolicDecomposition to solve a simple linear problem Ax = b for, say, a batch of 2 input scipy csr matrices with the same sparsity pattern.
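As background for the batched setting: the LU snippet in this thread passes the CSR row pointers and column indices once, and the values with a leading batch dimension. A minimal sketch of preparing that layout from SciPy matrices follows. The helper name stack_batch is hypothetical, and whether SymbolicDecomposition expects the same layout is not established in this thread.

```python
import numpy as np
import scipy.sparse as sp

def stack_batch(matrices):
    """Return (indptr, indices, values) with values of shape (batch, nnz).

    Raises ValueError if the matrices do not share one sparsity pattern.
    """
    mats = [sp.csr_matrix(m, copy=True) for m in matrices]
    for m in mats:
        m.sort_indices()  # canonical column order within each row
    ref = mats[0]
    for m in mats[1:]:
        same = (np.array_equal(m.indptr, ref.indptr)
                and np.array_equal(m.indices, ref.indices))
        if not same:
            raise ValueError("matrices do not share a sparsity pattern")
    values = np.stack([m.data for m in mats])
    return ref.indptr, ref.indices, values

A0 = sp.csr_matrix(np.array([[4.0, 1.0, 0.0],
                             [1.0, 3.0, 1.0],
                             [0.0, 1.0, 2.0]]))
A1 = A0 * 2.0  # same pattern, different values

indptr, indices, values = stack_batch([A0, A1])
print(values.shape)  # (2, 7)
```

The three arrays can then be converted to tensors (for example with torch.from_numpy) before being handed to a batched solver.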