Open aiboyko opened 8 years ago
The rank selection in the residual-based truncation was different: in Matlab, it started from below and increased the rank, while in Fortran it started from the full rank and decreased it. I made it Fortran way in both cases. Try to pull and see if it helps.
It doesn't seem to help, although I'm pretty sure code files were updated. Unless MATLAB needs to reboot to reload files.
amen_solve2(A,b,tol,'max_full_size',15000,'kickrank',10,'x0',b,'nswp',15,... 'resid_damp',1 ,'rmax',100);
=amen_solve= sweep 1, max_dx: 1.999e+00, max_res: 1.999e+00, max_rank: 12 =amen_solve= sweep 2, max_dx: 1.442e+00, max_res: 3.240e-01, max_rank: 22 =amen_solve= sweep 3, max_dx: 2.146e+00, max_res: 5.447e+00, max_rank: 32 =amen_solve= sweep 4, max_dx: 9.203e-01, max_res: 3.143e+00, max_rank: 41 =amen_solve= sweep 5, max_dx: 3.406e-01, max_res: 2.673e+00, max_rank: 51 =amen_solve= sweep 6, max_dx: 4.106e+00, max_res: 1.783e+01, max_rank: 61 =amen_solve= sweep 7, max_dx: 4.812e-01, max_res: 2.279e+00, max_rank: 71 =amen_solve= sweep 8, max_dx: 3.319e-01, max_res: 2.111e+00, max_rank: 81 =amen_solve= sweep 9, max_dx: 4.278e-01, max_res: 1.130e+00, max_rank: 91 =amen_solve= sweep 10, max_dx: 4.316e-01, max_res: 1.017e+00, max_rank: 101 =amen_solve= sweep 11, max_dx: 2.031e-01, max_res: 2.731e-01, max_rank: 110 =amen_solve= sweep 12, max_dx: 1.977e-01, max_res: 6.424e-01, max_rank: 110 =amen_solve= sweep 13, max_dx: 1.929e-01, max_res: 3.149e-01, max_rank: 110 =amen_solve= sweep 14, max_dx: 4.607e-01, max_res: 7.568e-01, max_rank: 110 =amen_solve= sweep 15, max_dx: 4.373e-01, max_res: 1.863e+00, max_rank: 110
Then send me A and b, I will try to figure it out
It may take a while... The matrix is indefinite, so it's not a surprise that amen may not converge. It's more intriguing why the Fortran version converges. It's not exactly a bug, you may try any positive definite complex matrix, e.g. M+i*Laplace, it works. But there seem to be a subtle algorithmic difference... I hope it's in our code, not in different Lapacks.
OK, I'm starting to believe this is due to Matlab's Lapack... I've updated the fort_amen_solve_mex.F90, replacing some older version of amen by dtt/ztt_amen_solve that is now used in ttpy. The interface fort_amen_solve.m does not play a role, since we give an initial guess x0=b. So we have the same ztt_amen_solve, but invoked from Matlab. And...
u = fort_amen_solve(A,b,1e-5,'max_full_size',15000,'kickrank',10,'x0',b,'nswp',15); amen_solve: swp=1, max_dx= 1.999E+00, max_res= 1.999E+00, max_rank=12 amen_solve: swp=2, max_dx= 1.330E+00, max_res= 7.468E-01, max_rank=22 amen_solve: swp=3, max_dx= 1.298E+00, max_res= 1.173E+01, max_rank=32 amen_solve: swp=4, max_dx= 3.981E-01, max_res= 2.700E+00, max_rank=42 amen_solve: swp=5, max_dx= 5.722E-01, max_res= 5.403E+00, max_rank=52 amen_solve: swp=6, max_dx= 4.127E-01, max_res= 2.105E+00, max_rank=62 amen_solve: swp=7, max_dx= 5.570E-01, max_res= 1.729E+00, max_rank=72 amen_solve: swp=8, max_dx= 2.014E+00, max_res= 2.490E+00, max_rank=82 amen_solve: swp=9, max_dx= 4.248E-01, max_res= 1.155E+00, max_rank=92 amen_solve: swp=10, max_dx= 2.744E-01, max_res= 7.353E-01, max_rank=102 amen_solve: swp=11, max_dx= 3.327E-01, max_res= 5.698E-01, max_rank=112 amen_solve: swp=12, max_dx= 2.664E-01, max_res= 8.439E-01, max_rank=122 amen_solve: swp=13, max_dx= 2.933E-01, max_res= 1.006E+00, max_rank=132 amen_solve: swp=14, max_dx= 4.066E-01, max_res= 9.980E-01, max_rank=142 amen_solve: swp=15, max_dx= 1.927E-01, max_res= 3.627E-01, max_rank=152
... it does not converge.
How robustly reproducible is the convergence in Python/Fortran?
So Ivan believes that it's a distinctive bug in MATLAB version
so on the same matrix, with the same inputs:
x=amen_solve2(A,b,tol,'max_full_size',15000,'kickrank',10,'x0',b,'nswp',15,... 'resid_damp',1 ,'rmax',100);
ttpy amen_solve: swp=1, max_dx= 1.999E+00, max_res= 1.999E+00, max_rank=12 amen_solve: swp=2, max_dx= 1.939E+00, max_res= 4.579E-01, max_rank=22 amen_solve: swp=3, max_dx= 7.191E-01, max_res= 5.247E+00, max_rank=32 amen_solve: swp=4, max_dx= 4.561E-01, max_res= 2.237E+00, max_rank=42 amen_solve: swp=5, max_dx= 5.151E-01, max_res= 3.403E+00, max_rank=52 amen_solve: swp=6, max_dx= 4.186E-01, max_res= 3.818E+00, max_rank=62 amen_solve: swp=7, max_dx= 4.798E-01, max_res= 2.816E+00, max_rank=72 amen_solve: swp=8, max_dx= 4.060E-01, max_res= 1.799E+00, max_rank=82 amen_solve: swp=9, max_dx= 1.037E-02, max_res= 7.897E-02, max_rank=87 amen_solve: swp=10, max_dx= 7.175E-04, max_res= 1.934E-03, max_rank=97 amen_solve: swp=11, max_dx= 9.770E-05, max_res= 7.761E-04, max_rank=92 amen_solve: swp=12, max_dx= 8.662E-06, max_res= 2.994E-05, max_rank=96 amen_solve: swp=13, max_dx= 1.226E-05, max_res= 2.031E-05, max_rank=97 amen_solve: swp=14, max_dx= 5.315E-06, max_res= 1.115E-05, max_rank=90 amen_solve: swp=15, max_dx= 7.059E-06, max_res= 1.252E-05, max_rank=86
TT-Toolbox =amen_solve= sweep 1, max_dx: 1.999e+00, max_res: 1.999e+00, max_rank: 12 =amen_solve= sweep 2, max_dx: 1.158e+00, max_res: 7.469e-01, max_rank: 22 =amen_solve= sweep 3, max_dx: 2.800e+00, max_res: 6.261e+01, max_rank: 32 =amen_solve= sweep 4, max_dx: 6.230e-01, max_res: 4.422e+00, max_rank: 42 =amen_solve= sweep 5, max_dx: 5.247e-01, max_res: 1.826e+00, max_rank: 52 =amen_solve= sweep 6, max_dx: 3.155e-01, max_res: 1.740e+00, max_rank: 62 =amen_solve= sweep 7, max_dx: 5.078e-01, max_res: 1.809e+00, max_rank: 72 =amen_solve= sweep 8, max_dx: 3.439e-01, max_res: 2.840e+00, max_rank: 82 =amen_solve= sweep 9, max_dx: 5.985e-01, max_res: 8.537e-01, max_rank: 92 =amen_solve= sweep 10, max_dx: 3.544e-01, max_res: 6.569e-01, max_rank: 102 =amen_solve= sweep 11, max_dx: 3.274e-01, max_res: 4.083e-01, max_rank: 110 =amen_solve= sweep 12, max_dx: 2.944e-01, max_res: 5.549e-01, max_rank: 110 =amen_solve= sweep 13, max_dx: 3.590e-01, max_res: 4.755e-01, max_rank: 110 =amen_solve= sweep 14, max_dx: 3.269e-01, max_res: 1.476e+00, max_rank: 110 =amen_solve= sweep 15, max_dx: 2.507e-01, max_res: 4.531e-01, max_rank: 110