opencollab / arpack-ng

Collection of Fortran77 subroutines designed to solve large scale eigenvalue problems.

Segmentation Fault, mkl_sparse_d_mv, real symmetric lanczos #445

Closed: Enlil50 closed this issue 10 months ago

Enlil50 commented 10 months ago

Expected behavior

I don't know if this is the right place to ask, but since this community is (obviously) more familiar with arpack-ng than the MKL forums are, it is probably the best place.

I have a C++ project built with Spectra. A colleague of mine wrote the equivalent program with arpack-ng, in Fortran, using MKL and the Intel compilers. For doubles, our eigenvalues differ from the 7th decimal digit onward.

I am now trying to use this library to make the comparison myself. I have my colleague's project and, even though mine is written differently, the sections that give me the error are essentially the same.

Actual behavior

I compared row_arr, col_arr and val_arr with the matrix printed from the C++ code and they are the same, so the problem should not be in how I wrote the algorithm that initializes the matrix.

There is no warning or error until mkl_sparse_d_mv, so mkl_sparse_d_create_coo and dsaupd give no problems.

Where/how to reproduce the problem

CMakeLists.txt:

# copy paste the codes below when you are in bin.
# cmake ../ -B ../build -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_CXX_COMPILER=icpx
# cmake --build ../build

cmake_minimum_required(VERSION 3.24)
project(modulo_02 LANGUAGES CXX Fortran)

set(executable_and_libraries your/path/to/dir)
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${executable_and_libraries}/bin)

#   %---------------%
#   |  MKL FORTRAN  |
#   %---------------%

set(MKL_THREADING sequential)
set(MKL_LINK static)
set(MKLROOT /opt/intel/oneapi/mkl/2024.0)
find_package(MKL CONFIG REQUIRED PATHS ${MKLROOT})

#First compile mkl_spblas.f90 into an object file (mkl_spblas.o) that you can then link together with all the other libraries.
#I compiled it directly from the terminal, knowing the exact location of MKLROOT:
#N.B.: THE .mod FILE MUST END UP IN THE BUILD DIRECTORY; for this reason I added the -module flag.
#ifort -fpp -I/opt/intel/oneapi/mkl/2024.0/include -w /opt/intel/oneapi/mkl/2024.0/include/mkl_spblas.f90 -c -o ../src_external/mkl_spblas.o -Ofast -i8 -static -module ../build

#Home made Fortran module
add_library(Quantum_Ising_1D STATIC)
target_sources(Quantum_Ising_1D PUBLIC
    ${executable_and_libraries}/src_homemade/Quantum_Ising_1D.f90
)
set_target_properties(Quantum_Ising_1D PROPERTIES
    COMPILE_FLAGS "-Ofast -i8"   
)

#Main program Fortran
add_executable(main_program_fortran 
    ${executable_and_libraries}/bin/main_program_fortran.f90
)
target_link_libraries(main_program_fortran PUBLIC 
    Quantum_Ising_1D

    ${executable_and_libraries}/src_external/arpack-ng/build/libarpackILP64.a
    ${executable_and_libraries}/src_external/mkl_spblas.o
    ${MKLROOT}/lib/intel64/libmkl_blas95_ilp64.a
    ${MKLROOT}/lib/intel64/libmkl_lapack95_ilp64.a
    $<LINK_GROUP:RESCAN, ${MKLROOT}/lib/libmkl_intel_ilp64.a, ${MKLROOT}/lib/libmkl_sequential.a, ${MKLROOT}/lib/libmkl_core.a, ${MKLROOT}/lib/libmkl_blacs_intelmpi_ilp64.a >
)
target_include_directories(main_program_fortran PUBLIC
    ${MKLROOT}/include/mkl/intel64/ilp64 BEFORE
    ${MKLROOT}/include AFTER
)
set_target_properties(main_program_fortran PROPERTIES
    COMPILE_FLAGS "-Ofast -i8"
    #COMPILE_FLAGS "-fpp" #needed for preprocessing. It's needed just when compiling the actual library which uses it.
    LINK_FLAGS "-lpthread -lm -ldl -static" #if the libraries are dynamic, then you need to add link flags just to the executable
)

Main program:

program main_program_fortran
    use Quantum_Ising_1D
    implicit none

    integer(8) :: number_sites, nev, ncv
    real(8) :: trans_magnet_field, long_magnet_field !, long_magnetiz, trans_magnetiz
    real(8), allocatable :: eval_arr(:), evec_matrix(:,:)

    number_sites = 5
    nev = 1
    ncv = 3

    long_magnet_field = 0
    trans_magnet_field = 1.1

    print *, 'IF YOU COMPILE WITH INTEL, TRUE=-1.'
    print *, 'THIS PROGRAM RELIES ON INTEL UGLY CONVENTION'
    print *, '  '

    call Initialize_Quantum_Ising(number_sites, nev, ncv)
    call Initialize_Hamiltonian(long_magnet_field, trans_magnet_field)

    allocate(eval_arr(nev))
    allocate(evec_matrix(tot_Hilbert_elem, nev))

    call diag_lanczos('I', 'SA', eval_arr, evec_matrix)

end program main_program_fortran

My Module (Initialization, diagonalization):

module Quantum_Ising_1D
    use mkl_spblas
    implicit none
    private
    public Initialize_Quantum_Ising, Initialize_Hamiltonian, diag_lanczos, tot_Hilbert_elem

    integer, parameter :: number_spin_states = 2
    integer, parameter :: dimensions = 1
    integer, parameter :: boundary_conditions_flag = 1
    integer :: number_sites, total_sites, tot_Hilbert_elem, nev, ncv, ldv, nnz
    logical, allocatable :: comput_base(:,:)

    type(sparse_matrix_T) :: sp_matrix
    type(MATRIX_DESCR) :: descr_sp_matrix !matrix properties: symmetric, Hermitian, etc.

    ! THE FOLLOWING LOCAL NAMES ARE LEFT AS IN THE ORIGINAL ARPACK DOCUMENTATION
    ! Allocations are done in initialize_comput_base

    !     %------------------------------------------------------%
    !     | NCV is the largest number of basis vectors that will |
    !     |     be used in the Implicitly Restarted Arnoldi      |
    !     |     Process.  Work per major iteration is            |
    !     |     proportional to N*NCV*NCV.                       |
    !     |                                                      |
    !     |NEV is the number of eigenvalues                      |
    !     %------------------------------------------------------%

    !     %------------------------------------------------------%
    !     |  Note: NEV and NCV must satisfy the following        |
    !     | conditions:                                          |
    !     |                  NEV + 1 <= NCV                      |
    !     %------------------------------------------------------%

    !     %--------------%
    !     | Local Arrays |
    !     %--------------%

    Real(8), allocatable :: lancz_basis_matrix(:,:), workl(:), workd(:), resid(:), ax(:)
    logical, allocatable :: select(:)
    integer :: iparam(11), ipntr(11)

    !     %---------------%
    !     | Local Scalars |
    !     %---------------%

    integer(4) :: previous_file_digits
    integer :: ido, length_workl, info, ierr, j, ishfts, maxitr, mode1, nconv, itry
    Real(8) :: tol, sigma, rnorm    !tol(arpack.ng) == tol(Spectra C++), 
                                    !maxitr(arpack.ng) == maxit(Spectra C++)

    !     %------------%
    !     | Parameters |
    !     %------------%

    Real(8), parameter :: zero=0.0d+0  !used for tol: passing tol=0 makes dsaupd fall back to machine precision

    !     %-----------------------------%
    !     | BLAS & LAPACK routines used |
    !     %-----------------------------%   

    !   %------------------------------------------------------------%
    !   |                       Routines called:                     |
    !   | dsaupd  ARPACK reverse communication interface routine.    |
    !   | dseupd  ARPACK routine that returns Ritz values            |
    !   |       and (optionally) Ritz vectors.                       |
    !   | dnrm2   Level 1 BLAS that computes the norm of a vector.   |
    !   | daxpy   Level 1 BLAS that computes y <- alpha*x+y.         |
    !   %------------------------------------------------------------%
    Real(8) dnrm2  
    external dnrm2, daxpy, dseupd, dsaupd, dcopy, dgetv0

    contains

    subroutine Initialize_Quantum_Ising(number_sites_external, nev_external, ncv_external)
        implicit none
        integer, intent(in) :: number_sites_external, nev_external, ncv_external
        integer :: i_row, j_col, i_holder ! j_holder

        number_sites = number_sites_external
        nev = nev_external
        ncv = ncv_external

        total_sites = number_sites*dimensions
        tot_Hilbert_elem = number_spin_states**total_sites
        ldv = tot_Hilbert_elem
        nnz = tot_Hilbert_elem*(total_sites+1)

        !Initializing all independent configurations of the Hilbert space.

        allocate(comput_base(tot_Hilbert_elem, total_sites)) !! SAME AS C++
        i_holder = 0
        do i_row = 1, tot_Hilbert_elem
            i_holder = i_row - 1
            do j_col = 1, total_sites
                comput_base(i_row, j_col) = MOD(i_holder,2)
                i_holder = i_holder/2
            end do
        end do
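        ! Note: MOD(i_holder,2) above is an integer while comput_base is
        ! logical; that assignment (and the arithmetic done on comput_base in
        ! Initialize_Hamiltonian) relies on the non-standard integer/logical
        ! conversion of the Intel compilers that the main program warns about.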

    !     %-------------------------%
    !     | Local Arrays allocation |
    !     %-------------------------%

        allocate(lancz_basis_matrix(ldv,ncv)) 
        length_workl = ncv*(ncv+8)
        allocate(workl(length_workl))
        allocate(workd(3*tot_Hilbert_elem))
        allocate(resid(tot_Hilbert_elem))
        allocate(ax(tot_Hilbert_elem))
        allocate(select(ncv)) 

    end subroutine Initialize_Quantum_Ising

    subroutine Initialize_Hamiltonian(long_magnet_field, trans_magnet_field)
        implicit none
        real(8), intent(in) :: long_magnet_field, trans_magnet_field
        integer :: i_row, j_col !indices of the computational basis we are looping over.
        integer :: new_i_row !index for the off-diagonal elements of sp_matrix.
        real(8) :: val_arr(nnz)
        integer(8) :: col_arr(nnz), row_arr(nnz)
        integer :: index_holder
        integer :: stat           !If altered an error occurred
        stat = 0

        descr_sp_matrix % TYPE = sparse_matrix_TYPE_SYMMETRIC            !!! col .geq. row
        descr_sp_matrix % Mode = SPARSE_FILL_MODE_UPPER
        descr_sp_matrix % diag = SPARSE_DIAG_NON_UNIT

    ! %---------------------------------------------------------%
    ! | The following array order will be:                      |
    ! | diagonal elements as first tot_Hilbert_elem             |
    ! | non diagonal as the other ones                          |
    ! %---------------------------------------------------------%

        ! fortran is a column major language

        !   %---------------------------------------%
        !   |   (Sigma_(z) * Sigma_(z+1)) coupling  |
        !   %---------------------------------------%

        do i_row=1, tot_Hilbert_elem
            col_arr(i_row) = i_row
            row_arr(i_row) = i_row
            do j_col=1, total_sites-1

                if (comput_base(i_row, j_col) == comput_base(i_row, j_col+1)) then
                    val_arr(i_row) = - 1.d0
                else
                    val_arr(i_row) = + 1.d0
                end if

            end do
        end do

        if (boundary_conditions_flag==1) then
            do i_row=1, tot_Hilbert_elem
                ! col_arr(i_row) = i_row
                ! row_arr(i_row) = i_row

                if (comput_base(i_row, total_sites) == comput_base(i_row, 1)) then
                    val_arr(i_row) = val_arr(i_row) - 1.d0
                else
                    val_arr(i_row) = val_arr(i_row) + 1.d0
                end if

            end do
        end if

        index_holder =  tot_Hilbert_elem !we have filled all diagonal elements

        !   %---------------------%
        !   |   Transverse Field  |
        !   %---------------------%

        do i_row=1, tot_Hilbert_elem
            do j_col=1, total_sites
                index_holder = index_holder + 1
                new_i_row = i_row + (1+2*comput_base(i_row, j_col)) * 2**(j_col-1)

                col_arr(index_holder) = i_row
                row_arr(index_holder) = new_i_row
                val_arr(index_holder) = - trans_magnet_field ! val_arr(index_holder) - trans_magnet_field they do not overlap
            end do
            ! if comput_base(i_row, j_col) == 1 then
            !     new_i_row = i_row - 2**(j_col-1)
            ! else 
            !     new_i_row = i_row + 2**(j_col-1)
            ! end if
        end do     

        stat = mkl_sparse_d_create_coo(sp_matrix, SPARSE_INDEX_BASE_ONE, tot_Hilbert_elem, tot_Hilbert_elem,&
        nnz, row_arr, col_arr, val_arr)
        if (stat .ne. 0) print*, 'error in sparse matrix creation: ', stat   
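
        ! Note: mkl_sparse_d_create_coo stores pointers to row_arr, col_arr
        ! and val_arr instead of copying them, so these arrays must stay
        ! allocated for as long as sp_matrix is used (this turns out to be
        ! the cause of the crash; see the resolution at the end of the thread).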

    end subroutine Initialize_Hamiltonian

    subroutine diag_lanczos(Bmat, which_eval, eval_array, evec_matrix, v0)
        implicit none

    !   %---------------------------------------%
    !   |           A*x=lambda*Bmat*x           |
    !   | where x is a vector, lambda the       |
    !   | eigenvalue to search for, and B=I.    |
    !   |                                       |
    !   | 'I' specifies that the problem to     |
    !   | solve is a standard Eigenvalues       |
    !   | problem.                              |
    !   |                                       |
    !   |              which_evals              |
    !   | 'SA' specifies that we are searching  |
    !   | for the Smallest Algebraic            |
    !   | eigenvalues; 'LM' for Largest         |
    !   | Magnitude.                            |
    !   %---------------------------------------%

        character,intent(in) :: Bmat(1), which_eval(2)

        real(8),intent(out) :: eval_array(nev)                            !array for eigenvalues
        real(8),intent(out),optional :: evec_matrix(tot_Hilbert_elem,nev) !1 array for each eigenvector --> matrix
        real(8),intent(in),optional :: v0(tot_Hilbert_elem)               !initial vector for lanczos

        integer :: stat
        logical :: rvec = .false.   !Do you want to compute eigenvectors too?
        ido = 0                     !reverse communication variable
        tol = zero                  !if zero, machine precision is used
        stat = 0                    !final state of computation. If altered an error occurred

        if (present(v0)) then 
            info=1  
            resid=v0
        else
            info=0
        end if

        if (present(evec_matrix)) rvec=.true.

        !     %-----------------------------------------------------%
        !     |                                                     |
        !     | Specification of stopping rules and initial         |
        !     | conditions before calling SSAUPD                    |
        !     |                                                     |
        !     | TOL  determines the stopping criterion.             |
        !     |                                                     |
        !     |      Expect                                         |
        !     |           abs(lambdaC - lambdaT) < TOL*abs(lambdaC) |
        !     |               computed   true                       |
        !     |                                                     |
        !     |      If TOL .le. 0,  then TOL <- macheps            |
        !     |           (machine precision) is used.              |
        !     |                                                     |
        !     | IDO  is the REVERSE COMMUNICATION parameter         |
        !     |      used to specify actions to be taken on return  |
        !     |      from SSAUPD. (See usage below.)                |
        !     |                                                     |
        !     |      It MUST initially be set to 0 before the first |
        !     |      call to SSAUPD.                                |
        !     |                                                     |
        !     | INFO on entry specifies starting vector information |
        !     |      and on return indicates error codes            |
        !     |                                                     |
        !     |      Initially, setting INFO=0 indicates that a     |
        !     |      random starting vector is requested to         |
        !     |      start the ARNOLDI iteration.  Setting INFO to  |
        !     |      a nonzero value on the initial call is used    |
        !     |      if you want to specify your own starting       |
        !     |      vector (This vector must be placed in RESID.)  |
        !     |                                                     |
        !     | The work array WORKL is used in SSAUPD as           |
        !     | workspace.  Its dimension length_workl is set as    |
        !     | illustrated below.                                  |
        !     |                                                     |
        !     %-----------------------------------------------------%

        !     %---------------------------------------------------%
        !     | Specification of Algorithm Mode:                  |
        !     |                                                   |
        !     | This program uses the exact shift strategy        |
        !     | (indicated by setting PARAM(1) = 1).              |
        !     | IPARAM(3) specifies the maximum number of Arnoldi |
        !     | iterations allowed.  Mode 1 of SSAUPD is used     |
        !     | (IPARAM(7) = 1). All these options can be changed |
        !     | by the user. For details see the documentation in |
        !     | SSAUPD.                                           |
        !     %---------------------------------------------------%

        ishfts = 1
        maxitr = 1000 !max iteration
        mode1 = 1

        iparam(1) = ishfts
        iparam(3) = maxitr
        iparam(7) = mode1

        !     %------------------------------------------------%
        !     | M A I N   L O O P (Reverse communication loop) |
        !     %------------------------------------------------%

        !        %---------------------------------------------%
        !        | Repeatedly call the routine DSAUPD and take |
        !        | actions indicated by parameter IDO until    |
        !        | either convergence is indicated or maxitr   |
        !        | has been exceeded.                          |
        !        %---------------------------------------------%

        do while (ido /= 99)

            call dsaupd ( ido, Bmat, tot_Hilbert_elem, which_eval, nev, tol, resid, ncv, &
            lancz_basis_matrix, ldv, iparam, ipntr, workd, workl, length_workl, info )

            if (ido .eq. -1 .or. ido .eq. 1) then

                !       %--------------------------------------%
                !       | Perform matrix vector multiplication |
                !       |              y <--- OP*x             |
                !       | The user should supply his/her own   |
                !       | matrix vector multiplication routine |
                !       | here that takes workd(ipntr(1)) as   |
                !       | the input, and return the result to  |
                !       | workd(ipntr(2)).                     |
                !       %--------------------------------------%

                print *, 'THIS IS PRINTED'

                stat = mkl_sparse_d_mv(SPARSE_OPERATION_NON_TRANSPOSE, 1.d0, sp_matrix, &
                descr_sp_matrix, workd(ipntr(1)), 0.d0, workd(ipntr(2))) 

                print *, 'THIS IS NOT PRINTED'

                if (stat .ne. 0) print*, 'error mkl_sparse_d_mv: ', stat

                !       %-----------------------------------------%
                !       | L O O P   B A C K to call DSAUPD again. |
                !       %-----------------------------------------%

            end if
        end do

        !     %----------------------------------------%
        !     | Either we have convergence or there is |
        !     | an error.                              |
        !     %----------------------------------------%

        if ( info .lt. 0 ) then
            print*, 'error in diagonalization dsaupd', info
            stop
        endif

        !        %-------------------------------------------%
        !        | No fatal errors occurred.                 |
        !        | Post-Process using DSEUPD.                |
        !        |                                           |
        !        | Computed eigenvalues may be extracted.    |
        !        |                                           |
        !        | Eigenvectors may be also computed now if  |
        !        | desired.  (indicated by rvec = .true.)    |
        !        |                                           |
        !        | The routine DSEUPD now called to do this  |
        !        | post processing (Other modes may require  |
        !        | more complicated post processing than     |
        !        | mode1.)                                   |
        !        |                                           |
        !        %-------------------------------------------%

        !'All' computes all NEV eigenvectors; 'S' computes only the ones specified by the array select.
        !The first v contains the B-orthonormal vectors in the case of generalized eigenvalue problems.

        call dseupd ( rvec, 'All', select, eval_array, lancz_basis_matrix, ldv, sigma,&
        &         Bmat, tot_Hilbert_elem, which_eval, tol, resid, ncv, lancz_basis_matrix, ldv,&
        &         iparam, ipntr, workd, workl, length_workl, ierr ) 

        !         %----------------------------------------------%
        !         | Eigenvalues are returned in the first column |
        !         | of the two dimensional array D and the       |
        !         | corresponding eigenvectors are returned in   |
        !         | the first NCONV (=IPARAM(5)) columns of the  |
        !         | two dimensional array V if requested.        |
        !         | Otherwise, an orthogonal basis for the       |
        !         | invariant subspace corresponding to the      |
        !         | eigenvalues in D is returned in V.           |
        !         %----------------------------------------------%

        if ( ierr .ne. 0) then
            print*, 'error in diagonalization dseupd', ierr
            stop
        endif

        nconv =  iparam(5)

    end subroutine diag_lanczos

end module Quantum_Ising_1D

Steps to reproduce the problem

1. Install MKL.
2. Install arpack-ng.
3. Put your directory in CMakeLists.txt.
4. Read the CMakeLists.txt comments (you need to compile mkl_spblas.f90).
5. Build with CMake with all the programs in the right folders (or just change the paths; it is easy and clear).

Error message

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
main_program_fort  0000000000F1D1C0  Unknown               Unknown  Unknown
main_program_fort  00000000007E59EE  Unknown               Unknown  Unknown
main_program_fort  000000000041C63B  Unknown               Unknown  Unknown
main_program_fort  00000000004144FF  Unknown               Unknown  Unknown
main_program_fort  0000000000413D56  Unknown               Unknown  Unknown
main_program_fort  0000000000403124  Unknown               Unknown  Unknown
main_program_fort  00000000004029A8  Unknown               Unknown  Unknown
main_program_fort  000000000040259D  Unknown               Unknown  Unknown
main_program_fort  0000000000F13EDA  Unknown               Unknown  Unknown
main_program_fort  0000000000F15777  Unknown               Unknown  Unknown
main_program_fort  0000000000402475  Unknown               Unknown  Unknown

To be fair, there is another warning I could add here, on the exact same subject:

ifort: warning #10315: specifying -lm before files may supersede the Intel(R) math library and affect performance
ld: /opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.a(mkl_memory_patched.o): in function `mkl_serv_set_memory_limit':
mkl_memory.c:(.text+0x5d1): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
[100%] Built target main_program_fortran

This warning is quite strange, since -lm is recommended by the Intel link-line advisor. My colleague uses make instead of cmake; he says it gives him no problem as long as -lm is added after the files. I specified it as a link flag, so it should not cause any errors.

sylvestre commented 10 months ago

Could you please provide a reduced test case?
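For illustration, a reduced test case could look roughly like the hypothetical sketch below (not taken from this thread). It assumes the same ILP64 / -i8 build as the CMakeLists above and exercises only mkl_sparse_d_create_coo and mkl_sparse_d_mv on a tiny symmetric matrix, with no ARPACK involved:

program coo_mv_repro
    ! Hypothetical reduced test case: 3x3 symmetric matrix stored as its
    ! upper triangle in COO format, multiplied once by a vector of ones.
    use mkl_spblas
    implicit none

    type(sparse_matrix_T) :: A
    type(MATRIX_DESCR)    :: descr
    integer(8), parameter :: n = 3, nnz = 4
    integer(8) :: row_arr(nnz), col_arr(nnz)
    real(8)    :: val_arr(nnz), x(n), y(n)
    integer    :: stat

    row_arr = [1, 1, 2, 3]
    col_arr = [1, 2, 2, 3]
    val_arr = [2.d0, -1.d0, 2.d0, 2.d0]
    x = 1.d0
    y = 0.d0

    descr % TYPE = sparse_matrix_TYPE_SYMMETRIC
    descr % Mode = SPARSE_FILL_MODE_UPPER
    descr % diag = SPARSE_DIAG_NON_UNIT

    stat = mkl_sparse_d_create_coo(A, SPARSE_INDEX_BASE_ONE, n, n, nnz, &
        row_arr, col_arr, val_arr)
    print *, 'mkl_sparse_d_create_coo status: ', stat

    stat = mkl_sparse_d_mv(SPARSE_OPERATION_NON_TRANSPOSE, 1.d0, A, descr, &
        x, 0.d0, y)
    print *, 'mkl_sparse_d_mv status: ', stat
    print *, 'y = ', y

    stat = mkl_sparse_destroy(A)
end program coo_mv_repro

With these values the symmetric upper-triangle storage should give y = (1, 1, 2); a crash or a wrong result here would point at the MKL setup rather than at the ARPACK driver.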

Enlil50 commented 10 months ago

Most of the module is matrix initialization plus long comments, so you could just skip those parts; however, deleting the initialization obviously leaves no matrix.

I removed all the unused subroutines from the module, and the CMakeLists.txt file is obviously needed. The main program is minimal.

If you need a different matrix I could spend some time creating a new one with a few fewer lines, but it would not change much.

Let me know.

P.S.: I removed a hundred lines or so by simplifying the matrix and removing some physics comments.

fghoussen commented 10 months ago

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
main_program_fort  0000000000F1D1C0  Unknown               Unknown  Unknown

Can you report the stack with debug symbols? Or step through it with gdb to check whether the arguments passed on your side are the same as the ones MKL sees as inputs.

MKL: did you link using the exact options given by the Intel link-line advisor (https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-link-line-advisor.html), add warnings-as-errors at link time if possible, and double-check that cmake really uses this exact expected link line (make VERBOSE=1)? If not, the problem is likely an MKL one, so you can close this issue. You'll find an example of using MKL here: https://github.com/opencollab/arpack-ng/blob/0eaf38bd51efd3046f0a7e61a08ddcd7ed1f1fcc/.github/workflows/jobs.yml#L153

Enlil50 commented 10 months ago

I solved the issue. I had created the triplet arrays as subroutine locals, so they were deallocated automatically when execution left that scope.

I had missed the fact that the sparse matrix handle still references them (which makes sense in hindsight, since nobody would want the library to copy the arrays internally).

It was a fault on my side. Thanks.
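
For anyone who hits the same crash: mkl_sparse_d_create_coo keeps pointers to the index and value arrays rather than copying them, so they have to outlive the handle. A minimal sketch of one possible fix (hypothetical names, not the poster's actual code) is to promote the triplet arrays from subroutine locals to module-level allocatables:

module hamiltonian_storage_sketch
    ! Sketch only: keep the COO triplet arrays at module scope so they stay
    ! allocated for as long as the sparse handle is in use.
    use mkl_spblas
    implicit none
    private
    public build_coo_matrix

    ! Module-level (persistent) arrays instead of subroutine locals.
    real(8),    allocatable :: val_arr(:)
    integer(8), allocatable :: row_arr(:), col_arr(:)

contains

    subroutine build_coo_matrix(sp_matrix, n, nnz)
        type(sparse_matrix_T), intent(out) :: sp_matrix
        integer(8), intent(in) :: n, nnz
        integer :: stat

        allocate(val_arr(nnz), row_arr(nnz), col_arr(nnz))

        ! ... fill row_arr / col_arr / val_arr exactly as in
        !     Initialize_Hamiltonian above ...

        ! The handle keeps referencing the module arrays, which now survive
        ! after this subroutine returns.
        stat = mkl_sparse_d_create_coo(sp_matrix, SPARSE_INDEX_BASE_ONE, &
            n, n, nnz, row_arr, col_arr, val_arr)
        if (stat /= 0) print *, 'error in sparse matrix creation: ', stat
    end subroutine build_coo_matrix

end module hamiltonian_storage_sketch

An alternative that avoids keeping the triplet arrays alive is to convert the COO handle into an MKL-owned format right after creation (for example with mkl_sparse_convert_csr) and then destroy the COO handle; the converted handle should carry its own copy of the data.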