xiaoyeli / superlu_mt

Increasing memory via sp_ienv, and different results per run with OpenMP WIN32 Fortran calls compared to ordinary SuperLU #12

Open arypramudito opened 9 months ago

arypramudito commented 9 months ago
  1. How do I increase the memory available to SuperLU_MT? I am using Fortran on Windows via Intel oneAPI.

    F:\sparse\mt\fortran>f77mod.exe < cage12.rb
    nproc = 12 threads used
    Use minimum degree ordering on A'*A.
    Storage for L subscripts exceeded; Current column 24279; Need at least 60980958;
    You may set it by the 8-th parameter in routine sp_ienv().
    Memory allocation failed at line 222 in file pmemory.c

I added some OpenMP directives to the f77_main.f example so that nprocs follows the number of OpenMP threads:

      nprocs = 2

    !$omp parallel
    !$omp master
    !$    nprocs = omp_get_num_threads()
    !$omp end master
    !$omp end parallel
      print *, nprocs, ' threads used'

PS: My mistake, this is already covered in the FAQ under "SuperLU_MT: exceeding storage error":

SuperLU_MT does not support dynamic memory expansion. It pre-allocates storage based on the nonzeros in the original matrix A and the estimated fill ratios given by parameters #6, #7, and #8 in SRC/sp_ienv.c. If the guesstimate is too small, you may get the following error message: "Storage for L subscripts exceeded; Current column xxxx; Need at least yyyyyy; You may set it by the 8-th parameter in routine sp_ienv()." Then you need to set the corresponding parameter to a value that is larger in magnitude. Section 3.5.2 of the Users' Guide explains this in more detail.
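To make "larger in magnitude" concrete, here is a minimal sketch of how I read the FAQ. It assumes that a negative parameter is a fill ratio multiplied by nnz(A) and that a positive parameter is an absolute number of entries; I have not checked this against pmemory.c. The helper names are hypothetical and not part of SuperLU_MT.

```c
/* Assumption (not verified against pmemory.c): a negative setting means
   "ratio * nnz(A)" entries, a positive setting means an absolute count. */
long long entries_from_setting(long long setting, long long nnz_a)
{
    return setting < 0 ? -setting * nnz_a : setting;
}

/* Smallest negative ratio whose allocation covers `need` entries.
   Expects nnz_a > 0. */
long long ratio_setting_for(long long need, long long nnz_a)
{
    long long ratio = (need + nnz_a - 1) / nnz_a;  /* ceiling division */
    return -ratio;
}
```

For example, with the "Need at least 60980958" from the error above and a hypothetical nnz(A) of 2032536, `ratio_setting_for` returns -31, while returning 60980958 directly would request exactly that many entries under the same assumption.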

Does this mean I must edit sp_ienv.c, or can I use environment variables? Is this the right place to change it?

    #else
        case 1: return (8);
        case 2: return (1);
        case 3: return (200);
        case 4: return (200);
        case 5: return (40);
    #endif
        case 6: return (-20);
        case 7: return (-100);
        case 8: return (60980958);
    }

    /* Value for ISPEC(8) before: case 8: return (-10) */
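On the environment-variable question: I have not confirmed whether SuperLU_MT reads any environment variables itself. The sketch below only shows what a local patch to sp_ienv.c might look like. The helpers `parse_int_or` and `env_int_or` and the variable name used afterwards are hypothetical, not part of the library.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal integer, or return `fallback`
   when the text is missing, malformed, or out of int range. */
int parse_int_or(const char *text, int fallback)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return fallback;
    value = strtol(text, &end, 10);
    if (*end != '\0' || value < INT_MIN || value > INT_MAX)
        return fallback;
    return (int) value;
}

/* Hypothetical helper: read an integer setting from the environment. */
int env_int_or(const char *name, int fallback)
{
    return parse_int_or(getenv(name), fallback);
}
```

With this in place, the last case could become `case 8: return env_int_or("SUPERLU_MT_FILL_LSUB", -10);`, where `SUPERLU_MT_FILL_LSUB` is a made-up name. That would allow trying different values without rebuilding the library each time.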
  2. The SuperLU_MT result changes on every run, and it differs from the result of ordinary SuperLU, when calling through c_fortran_pdgssv.obj on a larger matrix (Goodwin/rim): https://www.cise.ufl.edu/research/sparse/matrices/Goodwin/index.html https://www.cise.ufl.edu/research/sparse/RB/Goodwin/rim.tar.gz. Build settings: OpenMP, WIN32, Intel MKL, compiled with both VS CL and Intel ICX, together with IFORT.

      program f77_main
      integer maxn, maxnz
      parameter ( maxn = 2132536, maxnz = 2132536)
      integer rowind(maxnz), colptr(maxn)
      real*8 values(maxnz), b(maxn)
      integer n, nnz, nrhs, ldb, info
      integer nprocs

      call hbcode1(n, n, nnz, values, rowind, colptr)

      nrhs = 1
      ldb = n
      nprocs = 2
      do i = 1, n
         b(i) = 1
      enddo

      call c_bridge_pdgssv(nprocs, n, nnz, nrhs, values, rowind, colptr,
     $                     b, ldb, info)

      if (info .eq. 0) then
         write (*,*) (b(i), i=1, 10)
      else
         write(*,*) 'INFO from c_bridge_dgssv = ', info
      endif

      stop
      end

RESULT

First run:

    E:\intel\superlu_mt-master\superlu_mt-master\SRC\now\built>f77_main.exe < rim.rb
    Use minimum degree ordering on A'*A.
    Factor time = 0.77
    Factor flops = 2.114700e+09 Mflops = 2764.10
    Solve time = 0.04
    Solve flops = 1.983820e+07 Mflops = 537.72
    NZ in factor L = 2904678
    NZ in factor U = 7036982
    NZ in L+U = 9919100
    L\U MB 109.312 total MB needed 127.179 expansions 0
    9716729898987.25 19855082276513.7 22103722855047.9 11551746912007.8 14702398418294.2
    21506934209532.5 13996173763919.8 11130599870742.7 14099676818866.4 -8671219949845.16

Second run:

    E:\intel\superlu_mt-master\superlu_mt-master\SRC\now\built>f77_main.exe < rim.rb
    Use minimum degree ordering on A'*A.
    Factor time = 0.75
    Factor flops = 2.114960e+09 Mflops = 2820.74
    Solve time = 0.04
    Solve flops = 1.983882e+07 Mflops = 544.51
    NZ in factor L = 2904987
    NZ in factor U = 7036982
    NZ in L+U = 9919409
    L\U MB 109.310 total MB needed 127.177 expansions 0
    8200883692408.28 16757614123494.0 18655458148288.4 9749631002379.63 12408768756516.9
    18151770812903.1 11812717855321.1 9394184512827.24 11900074041552.3 -7318477205747.72
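To put a number on how far apart two runs are, a small helper like the one below can compare the printed solution vectors. It is only a sketch for diagnosing this issue: `max_rel_diff` is a hypothetical name, and it measures the largest componentwise relative difference, which is one of several reasonable choices.

```c
#include <stddef.h>

/* Absolute value without pulling in the math library. */
static double abs_d(double v)
{
    return v < 0.0 ? -v : v;
}

/* Largest componentwise relative difference between x and y,
   scaled by the larger magnitude of each pair. Pairs that are
   both zero contribute nothing. */
double max_rel_diff(const double *x, const double *y, size_t n)
{
    double worst = 0.0;
    size_t i;

    for (i = 0; i < n; ++i) {
        double ax = abs_d(x[i]);
        double ay = abs_d(y[i]);
        double scale = ax > ay ? ax : ay;
        if (scale > 0.0) {
            double d = abs_d(x[i] - y[i]) / scale;
            if (d > worst)
                worst = d;
        }
    }
    return worst;
}
```

Applied to the ten components printed by the two runs above, this gives roughly 0.156, far beyond the rounding-level differences one would expect from a different order of floating-point operations alone.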