hypre-space / hypre

Parallel solvers for sparse linear systems featuring multigrid methods.
https://www.llnl.gov/casc/hypre/

Presumed bug in hypre_ParCSRMatrixGenerateFFFCHost leads to crash #1133

Closed: horstsoft0815 closed this issue 4 weeks ago

horstsoft0815 commented 1 month ago

For BoomerAMG with Extended+i interpolation, the routine hypre_ParCSRMatrixGenerateFFFCHost, called from hypre_BoomerAMGBuildModExtPEInterpHost at par_mod_lr_interp.c:1366, produces spurious "-1" values in an index array of "As_FF_diag" (par_mod_lr_interp.c:1374). This later causes a crash when hypre_ParMatmul is called at par_mod_lr_interp.c:1661.

The "-1" values result from line gen_fffc.c:155, where they are first written to the array "fine_to_fine" and later written without check (only "CF_marker" is checked) into "A_FF_diag_j" at line gen_fffc.c:385. The problem is that in line gen_fffc.c:155 CF_marker[i] is checked (which is related to the sign of fine_to_fine[i]), but the value of fine_to_fine[A_diag_j[jA]] is accessed.

victorapm commented 1 month ago

@horstsoft0815 thanks for bringing this up.

Do you have a reproducer for your problem?

horstsoft0815 commented 1 month ago

Sorry it took me a while; this example should reproduce the issue. The CSR matrix is split into the index vectors col_csr_dump.dat and colPtr_csr_dump.dat, with the data in val_csr_dump.dat. It is then loaded in the example hypre_interp.cpp. I tried to build the example in a similar fashion to our software: we have a pre-existing matrix and don't want to copy it, so I hand the relevant pointers into the corresponding hypre structures and set the flags to non-owning, roughly as in the sketch below. hypre_example.zip
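
The wrapping itself boils down to the following (a condensed sketch of what the attached example does; it assumes hypre's internal seq_mv accessors, and row_ptr/col/val stand for our pre-existing CSR arrays):

#include "seq_mv.h"

/* Wrap pre-existing CSR arrays in a hypre_CSRMatrix without copying:
 * hand the pointers over and mark the matrix as non-owning so hypre
 * never frees our data. */
static hypre_CSRMatrix *wrap_csr(HYPRE_Int nrows, HYPRE_Int ncols,
                                 HYPRE_Int nnz, HYPRE_Int *row_ptr,
                                 HYPRE_Int *col, HYPRE_Complex *val)
{
   hypre_CSRMatrix *A = hypre_CSRMatrixCreate(nrows, ncols, nnz);
   hypre_CSRMatrixI(A)    = row_ptr;  /* row pointer array  */
   hypre_CSRMatrixJ(A)    = col;      /* column index array */
   hypre_CSRMatrixData(A) = val;      /* nonzero values     */
   hypre_CSRMatrixSetDataOwner(A, 0); /* non-owning flag    */
   return A;
}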

victorapm commented 1 month ago

@horstsoft0815 which software are you working on? Is it supposed to work in parallel (MPI)?

horstsoft0815 commented 1 month ago

The software is a commercial prototype of a charging simulation for the semiconductor industry. The bug was encountered without MPI/CUDA, even though HYPRE was compiled with CUDA support:

set(HYPRE_ENABLE_SHARED OFF)
set(HYPRE_ENABLE_BIGINT OFF)
set(HYPRE_ENABLE_MIXEDINT OFF)
set(HYPRE_ENABLE_SINGLE OFF)
set(HYPRE_ENABLE_LONG_DOUBLE OFF)
set(HYPRE_ENABLE_COMPLEX OFF)
set(HYPRE_ENABLE_HYPRE_BLAS ON)
set(HYPRE_ENABLE_HYPRE_LAPACK ON)
set(HYPRE_ENABLE_PERSISTENT_COMM OFF)
set(HYPRE_ENABLE_FEI OFF)
set(HYPRE_WITH_MPI OFF)
set(HYPRE_WITH_OPENMP OFF)
set(HYPRE_WITH_HOPSCOTCH OFF)
set(HYPRE_USING_DSUPERLU OFF)
set(HYPRE_USING_MAGMA OFF)
set(HYPRE_WITH_CALIPER OFF)
set(HYPRE_PRINT_ERRORS OFF)
set(HYPRE_TIMING OFF)
set(HYPRE_BUILD_EXAMPLES OFF)
set(HYPRE_BUILD_TESTS OFF)
set(HYPRE_USING_HOST_MEMORY OFF)
set(HYPRE_WITH_CUDA ON)
set(HYPRE_WITH_SYCL OFF)
set(HYPRE_ENABLE_UNIFIED_MEMORY OFF)
set(HYPRE_ENABLE_CUDA_STREAMS ON)
set(HYPRE_ENABLE_CUSPARSE ON)
set(HYPRE_ENABLE_DEVICE_POOL OFF)
set(HYPRE_ENABLE_CUBLAS ON)
set(HYPRE_ENABLE_CURAND ON)
set(HYPRE_ENABLE_GPU_PROFILING OFF)
set(HYPRE_ENABLE_ONEMKLSPARSE OFF)
set(HYPRE_ENABLE_ONEMKLBLAS OFF)
set(HYPRE_ENABLE_ONEMKLRAND OFF)
set(HYPRE_WITH_UMPIRE OFF)
set(HYPRE_WITH_UMPIRE_HOST OFF)
set(HYPRE_WITH_UMPIRE_DEVICE OFF)
set(HYPRE_WITH_UMPIRE_UM OFF)
set(HYPRE_WITH_UMPIRE_PINNED OFF)

victorapm commented 1 month ago

Thanks! Does the bug happen when compiling hypre without CUDA support?

horstsoft0815 commented 1 month ago

Yes, it also happens when hypre is compiled without CUDA support.

victorapm commented 1 month ago

@horstsoft0815 there is something wrong with your conversion routines. I've updated your code to use the IJ interface for setting up matrices and vectors (that's what we recommend); the core of it is sketched below. You can find the new implementation here: issue1133_solution.tar.gz
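
For reference, the IJ-based setup looks roughly like this (a trimmed sketch; the complete code is in the tarball, and n/row_ptr/col/val stand for your CSR data):

#include "HYPRE.h"
#include "HYPRE_IJ_mv.h"
#include "HYPRE_parcsr_mv.h"

/* Build a ParCSR matrix through the IJ interface from CSR arrays
 * (single rank, global rows and columns [0, n-1]).  hypre copies the
 * values, so ownership of the CSR arrays stays with the caller. */
static HYPRE_ParCSRMatrix csr_to_parcsr(HYPRE_BigInt n,
                                        const HYPRE_Int *row_ptr,
                                        const HYPRE_BigInt *col,
                                        const HYPRE_Complex *val)
{
   HYPRE_IJMatrix     ij_A;
   HYPRE_ParCSRMatrix parcsr_A;

   HYPRE_IJMatrixCreate(MPI_COMM_WORLD, 0, n - 1, 0, n - 1, &ij_A);
   HYPRE_IJMatrixSetObjectType(ij_A, HYPRE_PARCSR);
   HYPRE_IJMatrixInitialize(ij_A);

   /* insert one row at a time from the CSR arrays */
   for (HYPRE_BigInt i = 0; i < n; i++)
   {
      HYPRE_Int nnz_row = row_ptr[i + 1] - row_ptr[i];
      HYPRE_IJMatrixSetValues(ij_A, 1, &nnz_row, &i,
                              &col[row_ptr[i]], &val[row_ptr[i]]);
   }

   HYPRE_IJMatrixAssemble(ij_A);
   HYPRE_IJMatrixGetObject(ij_A, (void **) &parcsr_A);

   /* parcsr_A is owned by ij_A: call HYPRE_IJMatrixDestroy(ij_A)
    * only after you are done with parcsr_A. */
   return parcsr_A;
}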

Please let me know if this issue can be closed.

Here's the output about the preconditioner:

$ ./hypre_interp_v2 

 Num MPI tasks = 1

 Num OpenMP threads = 1

BoomerAMG SETUP PARAMETERS:

 Max levels = 25
 Num levels = 6

 Strength Threshold = 0.250000
 Interpolation Truncation Factor = 0.000000
 Maximum Row Sum Threshold for Dependency Weakening = 0.900000

 Coarsening Type = PMIS 
 measures are determined locally

 No global partition option chosen.

Operator Matrix Information:

             nonzero            entries/row          row sums
lev    rows  entries sparse   min  max     avg      min         max
======================================================================
  0    2312    15062  0.003     4    7     6.5  -2.820e-14   1.204e+01
  1     891    23539  0.030     5   66    26.4  -2.191e-14   1.403e+01
  2     232    11524  0.214     6   87    49.7   1.507e-03   1.430e+01
  3      60     1888  0.524     3   48    31.5   7.857e-01   1.398e+01
  4      18      298  0.920    12   18    16.6   4.798e+00   1.915e+01
  5       3        9  1.000     3    3     3.0   8.657e+00   2.148e+01

Interpolation Matrix Information:
                    entries/row        min        max            row sums
lev  rows x cols  min  max  avgW     weight      weight       min         max
================================================================================
  0  2312 x 891     1   11   4.0   1.086e-02   8.649e-01   5.758e-01   1.000e+00
  1   891 x 232     1   13   5.2   5.073e-03   8.866e-01   4.117e-01   1.000e+00
  2   232 x 60      1   15   5.9   4.015e-03   8.868e-01   2.632e-01   1.000e+00
  3    60 x 18      1    9   4.8   2.704e-03   4.377e-01   1.774e-01   1.000e+00
  4    18 x 3       0    3   1.4   4.017e-03   1.286e-01   0.000e+00   1.000e+00

     Complexity:   grid = 1.520761
               operator = 3.473642
                 memory = 4.240274

BoomerAMG SOLVER PARAMETERS:

  Maximum number of cycles:         1 
  Stopping Tolerance:               0.000000e+00 
  Cycle type (1 = V, 2 = W, etc.):  1

  Relaxation Parameters:
   Visiting Grid:                     down   up  coarse
            Number of sweeps:            1    1     1 
   Type 0=Jac, 3=hGS, 6=hSGS, 9=GE:     18   18     9 
   Point types, partial sweeps (1=C, -1=F):
                  Pre-CG relaxation (down):   0
                   Post-CG relaxation (up):   0
                             Coarsest grid:   0

HYPRE_BiCGSTABGetPrecond got good preconditioner object

horstsoft0815 commented 1 month ago

@victorapm Thanks for your fix. The problem is that a matrix copy must be avoided at all costs in our application, so it would be great to find the bug in my original code, which uses the lower-case "hypre_" variants. Is there any obvious problem with my version?

victorapm commented 1 month ago

@horstsoft0815 sorry, but I can't see any obvious problem.

victorapm commented 4 weeks ago

@horstsoft0815 I hope you have been able to find the issue in your code. I'm closing this for now, since we found that hypre_ParCSRMatrixGenerateFFFCHost behaves correctly. Feel free to reach out if you have other issues.