Closed connoraird closed 3 months ago
If you are compiling with "-O3" (set in system/system.*.make), could you try "-g" instead?
In my case I also hit an error, although it looks different.
However, I don't have the problem with -g.
Can you give us a little more information about the setup, compilation and output, please? For reference, I compiled with GCC 13.2 and OpenMPI 4.1.6 on a Mac (running Ventura 13.6.3) with compilers installed via Homebrew, using -O3 and linking against FFTW v3.3.10 and LibXC v6.2.2 (also from Homebrew). I used the current version of Conquest on the develop branch (output gives Version comment: Git Branch: develop; tag, hash: v1.2-156-ge1759e68). I ran on one process (mpirun -np 1) and found no problems running.
If you want more help, can you attach your input files, system.make and output file, and give details of how you run the code?
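The version details asked for here can be gathered in one go. The following is a minimal sketch, assuming a POSIX shell and that it is run from inside the Conquest git checkout; the helper name report_version is hypothetical and not part of Conquest.

```shell
#!/bin/sh
# Print the first lines of a tool's version banner, or note that it is missing.
# report_version is a hypothetical helper, not something shipped with Conquest.
report_version() {
    tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "== $tool =="
        "$tool" --version 2>&1 | head -n 2
    else
        echo "== $tool == not found on PATH"
    fi
}

# Compiler wrapper, MPI launcher and the underlying Fortran compiler.
for tool in mpif90 mpirun gfortran; do
    report_version "$tool"
done

# Exact source revision, only if we are inside a git working tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    echo "== source revision =="
    git describe --tags --always
fi
```

Pasting this output alongside the system.make file and Conquest_out should cover most of the information requested above.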
Thanks for the help, I've tried replicating your setup @davidbowler. I have checked out the most recent develop branch (commit e1759e68c19649eccc8fe1098ee0714f9c628347). I am using GCC 13.2 installed via Homebrew. I was using OpenMPI 5.0.1 installed via Homebrew, but I've now also tried OpenMPI 4.1.6 installed with MacPorts. Both have resulted in the same error stated above. I am running the command mpirun -np 1 ../../bin/Conquest from within testsuite/test_001_bulk_Si_1proc_Diag. My system.*.make file is printed below.
# This is a system-specific makefile for my local system. You will need to adjust
# it for the actual system you are running on.
# Set compilers
FC=mpif90
F77=mpif77
# OpenMP flags
# Set this to "OMPFLAGS= " if compiling without openmp
# Set this to "OMPFLAGS= -fopenmp" if compiling with openmp
OMPFLAGS= -fopenmp
# Compilation flags
# NB for gcc10 you need to add -fallow-argument-mismatch
COMPFLAGS= -O3 $(OMPFLAGS) $(XC_COMPFLAGS) -fallow-argument-mismatch
COMPFLAGS_F77= $(COMPFLAGS)
# Set BLAS and LAPACK libraries
# MacOS X
BLAS= -lvecLibFort
# Intel MKL use the Intel tool
# Generic
# BLAS= -llapack -lblas
# LibXC: choose between LibXC compatibility below or Conquest XC library
# Conquest XC library
#XC_LIBRARY = CQ
#XC_LIB =
#XC_COMPFLAGS =
# LibXC compatibility
# Choose LibXC version: v4 (deprecated) or v5/6 (v5 and v6 have the same interface)
#XC_LIBRARY = LibXC_v4
XC_LIBRARY = LibXC_v5
XC_LIB = -lxcf90 -lxc
XC_COMPFLAGS = -I/usr/local/include -I/opt/local/include -I/opt/homebrew/include -I/opt/homebrew/Cellar/libxc/6.2.2/include -I/opt/homebrew/Cellar/fftw/3.3.10_1/include
# Set FFT library
FFT_LIB=-lfftw3
FFT_OBJ=fft_fftw3.o
# Full library call; remove -lscalapack if using dummy diag module.
# If using OpenMPI, use -lscalapack-openmpi instead.
LIBS= $(FFT_LIB) $(XC_LIB) -lscalapack $(BLAS)
# Linking flags
LINKFLAGS= -L/usr/local/lib -L/opt/local/lib -L/opt/homebrew/lib -L/opt/homebrew/Cellar/libxc/6.2.2/lib -L/opt/homebrew/Cellar/fftw/3.3.10_1/lib $(OMPFLAGS)
ARFLAGS=
# Matrix multiplication kernel type
MULT_KERN = default
# Use dummy DiagModule or not
DIAG_DUMMY =
# Use dummy omp_module or not.
# Set this to "OMP_DUMMY = DUMMY" if compiling without openmp
# Set this to "OMP_DUMMY = " if compiling with openmp
OMP_DUMMY =
That's very odd, @connoraird! Can you upload the output file (if it is produced)?
Sure, the Conquest_out file is as follows:
________________________________________________________________________
CONQUEST
Concurrent Order N QUantum Electronic STructure
________________________________________________________________________
Conquest lead developers:
D.R.Bowler (UCL, NIMS), T.Miyazaki (NIMS), A.Nakata (NIMS),
L. Truflandier (U. Bordeaux)
Developers:
M.Arita (NIMS), J.S.Baker (UCL), V.Brazdova (UCL), R.Choudhury (UCL),
S.Y.Mujahed (UCL), J.T.Poulton (UCL), Z.Raza (NIMS), A.Sena (UCL),
U.Terranova (UCL), L.Tong (UCL), A.Torralba (NIMS)
Early development:
I.J.Bush (STFC), C.M.Goringe (Keele), E.H.Hernandez (Keele)
Original inspiration and project oversight:
M.J.Gillan (Keele, UCL)
________________________________________________________________________
Simulation cell dimensions: 10.3600 a0 x 10.3600 a0 x 10.3600 a0
Atomic coordinates (a0)
Atom X Y Z Species
1 0.0104 0.0207 0.0311 1
2 5.1800 5.1800 0.0000 1
3 5.1800 0.0000 5.1800 1
4 0.0000 5.1800 5.1800 1
5 2.5900 2.5900 2.5900 1
6 7.7700 7.7700 2.5900 1
7 2.5900 7.7700 7.7700 1
8 7.7700 2.5900 7.7700 1
Using a MP mesh for k-points: 2 x 2 x 2 G
This job was run on 2024/01/11 at 10:46 +0000
Code was compiled on 2024/01/11 at 10:34 +0000
Version comment: Git Branch: develop; tag, hash: v1.2-156-ge1759e68
Job title:
Job to be run: static calculation
Ground state search:
Support functions represented with PAO basis
1:1 PAO to SF mapping
Non-spin-polarised electrons
Solving for the K matrix using diagonalisation
Integration grid spacing: 0.288 a0 x 0.288 a0 x 0.288 a0
Number of species: 1
--------------------------------------------------------
| # mass (au) Charge (e) SF Rad (a0) NSF Label |
--------------------------------------------------------
| 1 28.086 4.000 0.576 9 Si |
--------------------------------------------------------
The calculation will be performed on 1 process
The calculation will be performed on 8 threads
Using the default matrix multiplication kernel
The functional used will be GGA PBE96
Error in process 1
make_halo: no. of atoms in halo must be .ge. 1
There is a subtle issue with GCC 13 that we have been chasing intermittently, which might cause this. Can you try compiling pseudo_tm_info.f90 without any optimisation (mpif90 -g -c pseudo_tm_info.f90), then remake the rest of the code and see if that helps? It is also worth trying with one thread only (or even without OpenMP).
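The suggested workaround can be sketched as a small shell helper. This assumes a full build has already been done (so the module files pseudo_tm_info.f90 depends on exist), that the current directory is the Conquest source directory, and that mpif90 and make are on the PATH; the function name rebuild_unoptimised is hypothetical. Extra flags from your system.make (for example -fallow-argument-mismatch) may need to be added to the compile line.

```shell
#!/bin/sh
# Recompile one source file with -g only (no -O3), then remake the rest.
# rebuild_unoptimised is a hypothetical helper name, not part of Conquest.
rebuild_unoptimised() {
    src="$1"
    if [ ! -f "$src" ]; then
        echo "cannot find $src in $(pwd)" >&2
        return 1
    fi
    # Compile command as quoted in the thread.
    mpif90 -g -c "$src" || return 1
    # The fresh object file is newer than its source, so make keeps it
    # and only rebuilds or relinks whatever depends on it.
    make
}

# Usage, from the source directory (directory layout is an assumption):
#   rebuild_unoptimised pseudo_tm_info.f90
#   cd ../testsuite/test_001_bulk_Si_1proc_Diag
#   OMP_NUM_THREADS=1 mpirun -np 1 ../../bin/Conquest
```

OMP_NUM_THREADS is the standard OpenMP environment variable for the thread count, so setting it to 1 covers the "one thread only" suggestion without rebuilding.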
After compiling pseudo_tm_info.f90 and remaking, as you said, and then running ./run_conquest_tests.sh, I get the following:
Running tests on 1 processes and 1 threads
Building on SYSTEM local, using makefile system/system.local.make
make: Nothing to be done for `default'.
Error in process 1
We need Conquest_input to run !
Error in process 1
We need Conquest_input to run !
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
Proc: [[40839,0],0]
Errorcode: 1
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
Error in process 1
We need Conquest_input to run !
Error in process 1
We need Conquest_input to run !
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
Proc: [[31203,0],0]
Errorcode: 1
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
============================================================================================================================= test session starts ==============================================================================================================================
platform darwin -- Python 3.11.6, pytest-7.4.3, pluggy-1.3.0
rootdir: /Users/connoraird/work/conquest/CONQUEST-release/testsuite
plugins: anyio-4.0.0
collected 12 items
test_check_output.py ............ [100%]
============================================================================================================================== 12 passed in 0.12s ==============================================================================================================================
My issue has been resolved through @davidbowler's suggestion of compiling pseudo_tm_info.f90 without optimisations before remaking.
The other errors displayed above were due to directories I had failed to clean up before running the tests.
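The thread does not say which directories were left behind. As one possible check, the sketch below lists test directories that have no Conquest_input file, which is what the "We need Conquest_input to run !" message complains about. It assumes test directories are named test_* (as in test_001_bulk_Si_1proc_Diag); the function name find_missing_input is hypothetical.

```shell
#!/bin/sh
# List test_* directories under a given path that lack a Conquest_input file.
# find_missing_input is a hypothetical helper; the test_* naming is assumed.
find_missing_input() {
    for d in "${1:-.}"/test_*/; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$d" ] || continue
        [ -f "${d}Conquest_input" ] || echo "$d"
    done
}

# Usage, from the repository root:
#   find_missing_input testsuite
```

Any directory it prints is a candidate for a stale leftover and is worth inspecting by hand before deleting.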
Related to #289
Fixed by #302
After compiling with gcc version 13 on Mac I get the following error running test_001...
There may be other issues with my setup. However, we thought it best to raise an issue anyway.