Closed wisecashew closed 1 year ago
Hi, can you please provide the code versions including how the LAMMPS executable was built?
Thank you for your response, @giacomofiorin! Yes, here it is:
#!/bin/bash
VERSION=29Sep2021
echo "deleting old tarball..."
rm stable_${VERSION}.tar.gz || true
echo "deleting old lammps build..."
rm -rf lammps-stable_${VERSION} || true
echo "now start grabbing tar file from the repo..."
wget https://github.com/lammps/lammps/archive/stable_${VERSION}.tar.gz
tar zxf stable_${VERSION}.tar.gz
cd lammps-stable_${VERSION}
mkdir build && cd build
module purge
module load intel/19.1.1.217
module load intel-mpi/intel/2019.7
cmake3 -D CMAKE_INSTALL_PREFIX=$HOME/.local.lammps.latest.w.accelrn \
-D CMAKE_BUILD_TYPE=Release \
-D LAMMPS_MACHINE=user_intel \
-D ENABLE_TESTING=yes \
-D BUILD_OMP=yes \
-D BUILD_MPI=yes \
-D CMAKE_C_COMPILER=icc \
-D CMAKE_CXX_COMPILER=icpc \
-D CMAKE_CXX_FLAGS_RELEASE="-Ofast -xHost -DNDEBUG" \
-D PKG_MOLECULE=yes -D PKG_RIGID=yes -D PKG_MISC=yes \
-D PKG_KSPACE=yes -D FFT=MKL -D FFT_SINGLE=yes \
-D PKG_EXTRA-MOLECULE=yes -D PKG_USER-INTEL=yes -D PKG_ASPHERE=yes -D PKG_CLASS2=yes -D PKG_OPENMP=yes -D PKG_OPT=yes -D PKG_EXTRA-DUMP=yes \
-D PKG_COLVARS=yes \
-D PKG_INTEL=yes -D INTEL_ARCH=cpu -D INTEL_LRT_MODE=threads ../cmake
make -j 16
make install
I have attached the script I use to build my LAMMPS executable (with CMake) to this message: stellar_intel_lammps_user_intel.sh.txt. I added the .txt extension just so it could be attached here.
FWIW, when I run with valgrind using the 2Aug2023 version of LAMMPS I get:
==196831== Conditional jump or move depends on uninitialised value(s)
==196831== at 0x7F9A390: colvar::periodic_boundaries(colvarvalue const&, colvarvalue const&) const (colvar.cpp:2158)
==196831== by 0x8087BED: colvar_grid<unsigned long>::init_from_colvars(std::vector<colvar*, std::allocator<colvar*> > const&, unsigned long, bool) [clone .isra.0] (colvargrid.h:299)
==196831== by 0x8088ECB: colvar_grid (colvargrid.h:258)
==196831== by 0x8088ECB: colvar_grid_count::colvar_grid_count(std::vector<colvar*, std::allocator<colvar*> >&, unsigned long const&, bool) (colvargrid.cpp:37)
==196831== by 0x7FDBEEF: colvarbias_abf::init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (colvarbias_abf.cpp:193)
==196831== by 0x8097040: parse_biases_type<colvarbias_abf> (colvarmodule.cpp:497)
==196831== by 0x8097040: colvarmodule::parse_biases(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (colvarmodule.cpp:523)
==196831== by 0x8099CCB: colvarmodule::parse_config(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (colvarmodule.cpp:278)
==196831== by 0x809A188: colvarmodule::read_config_file(char const*) (colvarmodule.cpp:210)
==196831== by 0x80BA370: colvarproxy::parse_module_config() (colvarproxy.cpp:531)
==196831== by 0x60AC9D8: LAMMPS_NS::FixColvars::one_time_init() (fix_colvars.cpp:448)
==196831== by 0x60AD030: LAMMPS_NS::FixColvars::setup(int) (fix_colvars.cpp:519)
==196831== by 0x5DD0487: LAMMPS_NS::Modify::setup(int) (modify.cpp:310)
==196831== by 0x5F55D89: LAMMPS_NS::Verlet::setup(int) (verlet.cpp:159)
==196831== Uninitialised value was created by a heap allocation
==196831== at 0x4841FB5: operator new(unsigned long) (vg_replace_malloc.c:472)
==196831== by 0x809585A: colvarmodule::parse_colvars(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (colvarmodule.cpp:422)
==196831== by 0x8099CB2: colvarmodule::parse_config(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (colvarmodule.cpp:273)
==196831== by 0x809A188: colvarmodule::read_config_file(char const*) (colvarmodule.cpp:210)
==196831== by 0x80BA370: colvarproxy::parse_module_config() (colvarproxy.cpp:531)
==196831== by 0x60AC9D8: LAMMPS_NS::FixColvars::one_time_init() (fix_colvars.cpp:448)
==196831== by 0x60AD030: LAMMPS_NS::FixColvars::setup(int) (fix_colvars.cpp:519)
==196831== by 0x5DD0487: LAMMPS_NS::Modify::setup(int) (modify.cpp:310)
==196831== by 0x5F55D89: LAMMPS_NS::Verlet::setup(int) (verlet.cpp:159)
==196831== by 0x5EE753E: LAMMPS_NS::Run::command(int, char**) (run.cpp:171)
==196831== by 0x5D3DA4C: LAMMPS_NS::Input::execute_command() (input.cpp:868)
==196831== by 0x5D3E67D: LAMMPS_NS::Input::file() (input.cpp:313)
This can be easily silenced by this change:
diff --git a/lib/colvars/colvar.cpp b/lib/colvars/colvar.cpp
index 700d3752ac..0cb5c1ebdb 100644
--- a/lib/colvars/colvar.cpp
+++ b/lib/colvars/colvar.cpp
@@ -30,6 +30,7 @@ colvar::colvar()
after_restart = false;
kinetic_energy = 0.0;
potential_energy = 0.0;
+ period = 0.0;
#ifdef LEPTON
dev_null = 0.0;
but I do not get a segmentation fault before or after this change.
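To illustrate why valgrind flags this pattern, here is a minimal sketch of the same kind of bug. It is not the Colvars source: `ToyVariable`, `is_periodic()` and `wrapped_difference()` are hypothetical names invented for this example, and the wrapping formula is only an assumption about what a periodicity check might guard. The point is that a member which is read in a branch must be given a value in the constructor, which is what the one-line patch above does for `period`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a class with a "period" member.
// All names here are illustrative and are not taken from Colvars.
struct ToyVariable {
    bool after_restart;
    double kinetic_energy;
    double potential_energy;
    double period;  // read in a branch below, so it needs a defined value

    ToyVariable()
        : after_restart(false),
          kinetic_energy(0.0),
          potential_energy(0.0),
          period(0.0)  // analogue of the one-line fix; without it the
                       // branch in is_periodic() reads an indeterminate value
    {}

    // Treat the variable as periodic only when a positive period is set.
    bool is_periodic() const { return period > 0.0; }

    // Difference a - b, wrapped into [-period/2, period/2) when periodic.
    double wrapped_difference(double a, double b) const {
        assert(period >= 0.0);  // a negative period would be a setup error
        double d = a - b;
        if (is_periodic()) {
            d -= period * std::floor(d / period + 0.5);
        }
        return d;
    }
};
```

With `period` initialised to zero, a default-constructed object is consistently treated as non-periodic. Without the initialisation, the outcome of the branch depends on whatever the heap allocation happened to contain, which is exactly what the "Conditional jump or move depends on uninitialised value(s)" message reports. A default member initialiser (`double period = 0.0;`) would be an alternative way to get the same guarantee.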
Thanks for the quick diagnosis @akohlmey!
Interestingly, this is one of the very oldest classes and the missing initialization went undetected all this time.
I am running a free energy calculation on the radius of gyration (Rg) of a polymer in water in LAMMPS using the COLVARS package. It is an NPT simulation with Intel acceleration and an ABF bias acting on Rg.
I am seeing an error that seems to occur AFTER the simulation has finished running, and I don't understand why this would happen. I have attached my simulation output. This is the final output message:
You can see this in the file npt.out. LAMMPS has also reported the total run time, so I assume the simulation has run its course, but it crashes right after. What could be causing this? I am running the following command on my cluster:
srun --ntasks=96 --nodes=1 --cpus-per-task=1 --exclusive lmp_colvar -sf intel -in npt.in > npt.out 2>&1
Here, sys.npt.data is my data file, sys.pnipam.water.settings is my settings file, colvars.inp is my Colvars input file, and npt.in is my LAMMPS input file. I have attached all my input files to this message. I would appreciate any advice you have for me.
colvars_inputs.zip