Dear Xiaohui,
Thanks for the detailed info! What if you try removing the "-qopenmp" flag in LINKFLAGS as well?
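That is, based on the makefile you posted, the link section would become something like this (everything else unchanged):
LINK =      mpiicpc -std=c++11
LINKFLAGS = $(OPTFLAGS) -L$(MKLROOT)/lib/intel64/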
I have always compiled my fix without OpenMP, and this set of errors means that my code almost certainly has OpenMP bugs in it. I hope you will be able to help me test the code once those bugs have been fixed!
All the best, Shern
Hi Shern,
I've tried removing both the USER-INTEL and USER-OMP packages, removed the two pppm_conp_intel.* files, and then removed the "-qopenmp" flag in the makefile, but it doesn't work. Same errors, with many core.* files.
Cheers, Xiaohui
Dear Xiaohui,
I have now pushed a fix which should allow it to compile with OpenMP.
Could you let me know what input files you are testing that result in the core files, and what kind of parallelization, if any, you are using with them (MPI, OpenMP, etc.)?
Hi Shern,
I just use the input file in the ./test/dilute folder. As for the parallelization, I am actually not sure about this; I know little about MPI. All I know is that we use Intel MPI, so I put "CC = mpiicpc" in my makefile.
I hope the above info helps.
Cheers, Xiaohui
Hi Xiaohui,
Did you try running the test like this?
export N=1; lmp -i input
Thanks, Shern
Hi Shern,
I think I did the same thing? I changed the first line in the input file to "variable n equal 1".
Also, I tested my own system. Initially the code ran perfectly; then I added:
"fix conp slab conp 1 1.979 1 2 0 0 inv conp.log pppm etypes 5 6 7 8 9 10"
Cheers, Xiaohui
I think I now understand what you mean. I'll try removing all the MPI settings and compiling again, then run it on one core.
Cheers, Xiaohui
Hi Shern,
I tested the ./test/dilute/input file with the default "serial" makefile (of course with BLAS and LAPACK added) using the gcc/7.4.0 compiler, and it works well.
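For reference, the only change I made to the default serial makefile was adding LAPACK and BLAS to the library line, roughly:
LIB = -llapack -lblas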
However, with your latest code (I copied all the *.cpp/*.h files into the src directory, and added the USER-INTEL package but not USER-OMP), compiled with the Intel compiler and Intel MPI, the error still exists.
Next, I removed the two pppm_conp_intel.* files, removed the USER-INTEL package, recompiled (Intel compiler and Intel MPI), and ran the ./test/dilute/input file; the error still exists.
I suspect that your code is not compatible with the Intel compiler or Intel MPI. Because I only have Intel MPI on the cluster, I can't test other combinations right now.
For now, I'll just use the serial version of LAMMPS (compiled with GCC). If you have other suggestions, please let me know.
Cheers, Xiaohui
Update: the default "mpi" makefile with GCC + mpicxx (the mpicxx is found in the intel/bin path; again, I know little about MPI, sorry) can do the "export N=1; lmp -i input" test run; however, multi-core runs still fail.
Hi Xiaohui!
Thanks for the extensive testing -- I'm usually not in the office Thursdays so I haven't been able to work on this.
It's quite strange because I have routinely run this code on Intel MPI -- it's valuable to see what could be causing these errors. Could I ask: what is the Newton setting ("newton on"/"newton off") in the runs, and does it relate to any of the errors? Also, how many MPI processes were you running on? Have you tried multi-process runs using the "processors 2" setting?
The other thing is that I usually compile with CMake instead of Make. I don't know if you feel confident trying that -- I will try to work out the Make compiling procedure on my computer and see what happens.
Thanks again for your help!
Cheers, Shern
Hi Shern,
I think it doesn't matter whether it's newton on or off in the input file, because I can run either one using 1 core. I use an HPC cluster which has many nodes with 24 cores each. The most I've tested is 1 node, and I have only tried those two cases: 1 core or 1 node (24 cores).
Sorry, I am not familiar with the CMake approach, as one of the packages I often use in LAMMPS can only be installed via Make. Could you please tell me the versions of the programs you usually use, for example the LAMMPS version, the compiler (GCC or Intel?) and the MPI? I think they might affect the compatibility of your code.
Cheers, Xiaohui
Hi Xiaohui,
I've located something that seems to make a difference on my machine. Can you change the library line so that it links to the LP64 rather than the ILP64 interface of Intel MKL (LP64 passes 32-bit integers to LAPACK, ILP64 passes 64-bit)? That is, do this:
LIB = -ltbbmalloc -lmkl_intel_lp64 -lmkl_sequential -lmkl_core
and see if it makes a difference. I don't know if it will solve the pppm_conp_intel issue. Let me get back to you on that.
Hi Xiaohui,
Regarding the problems with pppm_conp_intel, can I confirm whether you are seeing messages like "multiple definition of LAMMPS_NS::PPPMIntel::<function name>"? If so I think I know what the problem is (inheritance of templated functions), and I should be able to fix it within the next week or so.
Otherwise, let me know what the exact error messages are.
Thanks again!
Shern
Hi Shern,
Thanks for your suggestion. I think I've finally found the problem!
I changed the library to -lmkl_intel_lp64 and recompiled the code, then tested with 1 core, and it works. Then I tested 2, 4, 6, ... cores, and they all work well until I reach 8 cores. I thought it might be a memory limitation, so I ran it on a large node (1 TB of memory), but it still does not work. Could you please tell me what I could do?
BTW, this time I didn't add the pppm_conp_intel.* files. I think we can check those later, after this issue is fixed.
Cheers, Xiaohui
Interesting. I was able to replicate your error.
I found that when I ran on 8 processes but with the command "processors 2" uncommented, the code was able to run (this is on my laptop so it won't be a memory issue!). Can you try that on your side and see if it works?
The fix conp only runs calculations on processes which own electrode atoms, so for most systems a processor grid like "processors 2" is essential for load balancing. (Otherwise processes that own middle blocks of the simulation box, with no electrode atoms, will be waiting and doing nothing while other processes have to do more work.) I had not tested fix conp with other grids before -- in the test/dilute case, the default grid with 8 procs is 1 x 1 x 8, which seems to break something in the code. Also, the test/dilute case is a very long and narrow box, which is quite unlike usual simulation conditions anyway and probably makes the problem worse.
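(I don't remember the exact form of the commented line off-hand, but the idea is a LAMMPS processors command that limits the decomposition along the long, electrode-normal axis, for example:
processors * * 2
With 8 MPI ranks that should give a 2 x 2 x 2 grid instead of 1 x 1 x 8, so that every process owns some electrode atoms.)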
Let me know if "processors 2" fixes it for you. I will also work on the Intel files and see if there is a good fix for that.
Hi Shern,
For the test/dilute input example, "processors 2" does fix the problem. However, when I test it on my own system, it fails again. My system is similar to the dilute input, and the dimensions of my unit cell are 20 x 20 x 120 Å.
Do you have any suggestions for this case?
Cheers, Xiaohui
Hi Xiaohui,
Could you try the latest code to see if it solves the problem?
Separately, would it be possible for you to send me your input and data files for me to test?
All the best, Shern
Hi Shern,
Thanks a lot for your help! The latest code looks fine at the moment: I can run both test/dilute and my input file with 24 cores. However, when I rerun my input, the same error occurs again, i.e., I can only use 1 core, otherwise the program crashes. The interesting thing is that when I test the test/dilute input file with "export N=5", the program still runs normally.
Hi Xiaohui,
I am not sure what you mean by "when I rerun my input". I have tried running your input file and data multiple times on my laptop and it seems to be okay to me. I might need more detail to understand the exact problem you are running into.
By the way, looking at your input files, I notice that in system.in.init you have set the z boundaries to fixed and used the slab correction in kspace_modify. However, in input.lmp you are using the ffield flag and setting an electric field. These are not physically compatible (although LAMMPS will still run): finite field mode is meant to be used with a fully 3D periodic setup for increased speed, and it can be used with no modification to the configuration. (Separately, I can see that you have three layers on one electrode and only one layer on the other -- is that intentional?) See the Dufils paper cited in the README.md for more details.
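In the meantime, here is a schematic comparison (placeholder values, not copied from your files):
# slab-corrected setup, as in your current system.in.init: non-periodic z plus slab correction
boundary       p p f
kspace_modify  slab 3.0
# finite-field setup, as the ffield flag expects: fully 3D periodic, no slab correction
boundary       p p p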
Hi Shern,
Sorry for the confusion. I mean, my input doesn't show any errors when I run it. However, after generating the trajectory, I commented out the "fix nvt" and "run" commands and added "rerun ./system.lmp dump x y z" to recalculate the energies. This time, I got the same error as before, and I can only use one core for this rerun step.
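In other words, the end of my input for this step is essentially:
# fix ... nvt ...   (commented out)
# run ...           (commented out)
rerun ./system.lmp dump x y z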
In terms of the ffield flag, thank you for your suggestions. I was just testing all keywords in your code.
Cheers, Xiaohui
Dear Xiaohui,
Please try the latest code and let me know if it works for you. The electrode charge for dynamics runs may come out slightly different (~1 in 1e4 over 500 steps) because I've also corrected a bug that prevented the CPM charges from contributing to the setup forces.
Hi Xiaohui,
Any updates?
Hi Shern,
Sorry for the late reply. I am doing some benchmark simulations to check the performance of this polarizable code + finite field method by comparing against Prof. Mathieu Salanne's code "MetalWalls".
I've tested a simple metal/water interface and it looks OK (I haven't checked the detailed structure and capacitance yet). However, I found problems with my test file (the one I attached in this issue) over a longer timescale. I corrected the mistake that you pointed out, but the temperature fluctuations are still very large, and the water O-H bonds always break after about 100 ps. I think my input file may still have mistakes, but I can't find them. Could you please help me check it?
Thanks a lot for your help!
Xiaohui
Hi Xiaohui,
Thanks for your feedback! I have not tried out MetalWalls yet -- I am sure it will be a very useful comparison, especially for the metal electrode case (see later).
I have not been able to look in more detail at your files yet, but there have been many recent developments aimed at better modelling for metal electrodes specifically, and adding them into this code is one of my active research interests. I have heard of other people having problems using this code to properly model metals so you are not alone. You can look up the Salanne group's paper on Thomas-Fermi electrode models and Nakano and Sato's 2019 paper on the chemical potential equalisation approach for a detailed discussion of the deficiency of basic (electrostatics-only) CPM MD, and how QM-based extensions would do better. Since this would be a discussion more about the material physics of the situation and not the code, it might be better to continue discussing at my academic email (s.tee@uq.edu.au).
Hi Xiaohui,
Unless there are any further problems, I am marking this issue closed. Hope you enjoy using the code!
Hi Shern,
Thanks for your reply. So I am now opening a new issue for my compiling problems.
I am building this code on a Linux-based cluster, and I use the latest LAMMPS version (8 Apr 2021).
Some packages are installed:
#########################
(xhyang)[xhyang@mgt02 src]$ make ps | grep "YES"
Installed YES: package CLASS2
Installed YES: package CORESHELL
Installed YES: package KSPACE
Installed YES: package MANYBODY
Installed YES: package MOLECULE
Installed YES: package REPLICA
Installed YES: package RIGID
Installed YES: package USER-INTEL
Installed YES: package USER-MISC
Installed YES: package USER-MOLFILE
Installed YES: package USER-OMP
#########################
and the main part of the makefile is:
#########################
CC =        mpiicpc -std=c++11
OPTFLAGS =  -xHost -O2 -fp-model fast=2 -no-prec-div -qoverride-limits \
            -qopt-zmm-usage=high -DBUILD_OMP=no
CCFLAGS =   -qopenmp -qno-offload -ansi-alias -restrict \
            -DLMP_INTEL_USELRT -DLMP_USE_MKL_RNG $(OPTFLAGS) \
            -I$(MKLROOT)/include
SHFLAGS =   -fPIC
DEPFLAGS =  -M
LINK =      mpiicpc -std=c++11
LINKFLAGS = -qopenmp $(OPTFLAGS) -L$(MKLROOT)/lib/intel64/
LIB =       -ltbbmalloc -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core
SIZE =      size
ARCHIVE =   ar
ARFLAGS =   -rc
SHLIBFLAGS = -shared
#########################
As you can see, I use the Intel compiler and Intel MPI. I cannot fix this problem by adding the "-DBUILD_OMP=no" flag, and the only choice is to remove the two pppm_conp_intel.* files. However, the program then fails when calculating the matrix, and ends up with many core.* files.
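To be precise, "remove" here means deleting the two files from the src directory and rebuilding with the same makefile, roughly:
rm pppm_conp_intel.cpp pppm_conp_intel.h
make clean-all
make mymachine      # "mymachine" stands for the name of my own makefile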
Could you please give me some suggestions?
Please let me know if you need more details.
Cheers, Xiaohui