huttered40 / capital

Distributed-memory implementations of novel Cholesky and QR matrix factorizations
BSD 2-Clause "Simplified" License

Further optimization to cacqr with complete_inv=false #36

Open huttered40 opened 4 years ago

huttered40 commented 4 years ago

We won't benchmark this case, since I'm skeptical it will outperform cacqr with complete_inv=true, but I'd like to have it working nonetheless. My attempt seemed to fail, even though I checked the Q factor element by element on rank 0 and it matched the correct one. Perhaps the error is on a different process?

Here is a picture of the code I was working with: Screen Shot 2020-01-04 at 11 23 28 PM
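
One way to test the "error on a different process" guess is to compute the entrywise error on every rank, not just rank 0, and then find which rank holds the largest one. The sketch below is only an illustration: `local_max_abs_error` and `worst_rank` are hypothetical helper names, not part of capital, and the per-rank blocks are plain `std::vector<double>` values rather than the library's matrix type. In an actual MPI run, the second helper would likely be replaced by an `MPI_Allreduce` with `MPI_MAXLOC` over a (double, int) pair.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: largest absolute entrywise difference between one
// process's local block of the computed Q and the matching block of a
// trusted reference Q.
double local_max_abs_error(const std::vector<double>& computed,
                           const std::vector<double>& reference) {
    assert(computed.size() == reference.size());
    double err = 0.0;
    for (std::size_t i = 0; i < computed.size(); ++i) {
        err = std::max(err, std::fabs(computed[i] - reference[i]));
    }
    return err;
}

// Hypothetical serial stand-in for a max-with-location reduction across
// ranks: given the per-rank errors (index = rank), return the worst error
// together with the rank that holds it.
std::pair<double, int> worst_rank(const std::vector<double>& per_rank_error) {
    assert(!per_rank_error.empty());
    std::pair<double, int> worst{per_rank_error[0], 0};
    for (std::size_t r = 1; r < per_rank_error.size(); ++r) {
        if (per_rank_error[r] > worst.first) {
            worst = {per_rank_error[r], static_cast<int>(r)};
        }
    }
    return worst;
}
```

If the worst rank is not 0, that would be consistent with rank 0 looking correct while the overall check fails.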