Features
Added a new class, ParallelSGP, which uses MPI to build an SGP from a large, fixed training data set rather than constructing it on the fly. The construction of the SGP consists of two steps:
1. Load and distribute the training data and compute descriptors, by method load_local_training_data.
2. Compute kernel matrices and vectors from the training set, by method compute_matrices.
Method build is a general interface that calls load_local_training_data and compute_matrices.
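The data-distribution part of the first step can be pictured with a small serial sketch. It assumes a contiguous block partition of structures over MPI ranks. The helper name local_range and the partitioning scheme are illustrative assumptions, not taken from ParallelSGP, which may distribute the data differently.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of ParallelSGP): returns the half-open range
// [begin, end) of structure indices owned by `rank` when `n_structures` are
// split as evenly as possible over `n_ranks` processes. The first
// (n_structures % n_ranks) ranks each receive one extra structure.
std::pair<std::size_t, std::size_t> local_range(std::size_t n_structures,
                                                std::size_t n_ranks,
                                                std::size_t rank) {
  assert(n_ranks > 0 && rank < n_ranks);
  const std::size_t base = n_structures / n_ranks;
  const std::size_t extra = n_structures % n_ranks;
  // Ranks below `extra` are shifted by their own index, the rest by `extra`.
  const std::size_t begin = rank * base + (rank < extra ? rank : extra);
  const std::size_t end = begin + base + (rank < extra ? 1 : 0);
  return {begin, end};
}
```

In an MPI program each process would call such a helper with its own rank and the communicator size, then load and compute descriptors only for the structures in its range.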
Added unit test: tests/test_parallel_sgp.cpp
To run the unit test, use ./tests for the serial version
and mpirun -n <N_procs> ./tests for the parallel version, with N_procs = 1, 4, 9, 16, ...
Added timing test of MPI: timing/mpi_construction.cpp
To run the timing test, go to the folder build/timing and run mpirun -n <N_procs> ./time_mpi <N_strucs>, where N_strucs is the number of training structures. Each structure has 100 atoms and 5 sparse environments.
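To get a feel for how the timing test scales with N_strucs, the sketch below turns the stated per-structure sizes (100 atoms, 5 sparse environments) into totals. The struct and function names are hypothetical, and the memory estimate assumes a dense sparse-by-sparse kernel matrix stored as doubles on a single process, which is an assumption about storage rather than a description of timing/mpi_construction.cpp.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of the problem size for a given number of structures.
struct ProblemSize {
  std::size_t n_atoms;              // total atoms in the training set
  std::size_t n_sparse;             // total sparse environments
  std::size_t sparse_matrix_bytes;  // dense n_sparse x n_sparse matrix of doubles
};

ProblemSize timing_problem_size(std::size_t n_strucs) {
  const std::size_t atoms_per_struc = 100;  // from the test description
  const std::size_t sparse_per_struc = 5;   // from the test description
  ProblemSize size;
  size.n_atoms = n_strucs * atoms_per_struc;
  size.n_sparse = n_strucs * sparse_per_struc;
  // Assumption: one dense square matrix over all sparse environments.
  size.sparse_matrix_bytes = size.n_sparse * size.n_sparse * sizeof(double);
  return size;
}
```

The quadratic growth of the sparse-by-sparse matrix is one reason a distributed construction becomes attractive for large training sets.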
Added an .xyz file reader. Because distmatrix does not support Python bindings, ParallelSGP cannot expose a Python interface, so Python I/O tools such as ASE cannot be used to convert file data into arrays. A C++ reader for xyz files is therefore provided:
src/flare_pp/utils.cpp
flare_pp/utils.py
test_utils.cpp
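For readers unfamiliar with the format, here is a minimal sketch of reading one frame of a plain .xyz file in C++: an atom count, a comment line, then one "symbol x y z" line per atom. It is not the reader in src/flare_pp/utils.cpp, which may parse additional fields; the names XyzFrame and read_xyz_frame are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for one frame of a plain .xyz file.
struct XyzFrame {
  std::string comment;
  std::vector<std::string> species;
  std::vector<std::array<double, 3>> positions;
};

// Reads one frame from `in`. Returns false when no further frame is present
// and throws std::runtime_error if a frame is malformed.
bool read_xyz_frame(std::istream &in, XyzFrame &frame) {
  std::string line;
  // Skip blank lines that may separate frames.
  while (std::getline(in, line)) {
    if (line.find_first_not_of(" \t\r") != std::string::npos) break;
  }
  if (!in) return false;  // reached the end without finding a count line

  long n_atoms = -1;
  std::istringstream count_line(line);
  if (!(count_line >> n_atoms) || n_atoms < 0)
    throw std::runtime_error("xyz: invalid atom count line");

  if (!std::getline(in, frame.comment))
    throw std::runtime_error("xyz: missing comment line");

  frame.species.clear();
  frame.positions.clear();
  for (long i = 0; i < n_atoms; ++i) {
    if (!std::getline(in, line))
      throw std::runtime_error("xyz: fewer atom lines than declared");
    std::istringstream atom_line(line);
    std::string symbol;
    std::array<double, 3> r{};
    if (!(atom_line >> symbol >> r[0] >> r[1] >> r[2]))
      throw std::runtime_error("xyz: malformed atom line");
    frame.species.push_back(symbol);
    frame.positions.push_back(r);
  }
  return true;
}
```

Calling read_xyz_frame in a loop on a std::ifstream until it returns false would collect every frame of a multi-frame file.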
Dependencies
OpenMPI. Below are the modules that need to be loaded before running cmake
Distmatrix by Anders. Added to CMakeLists.txt
Todos
[x] Timing test of MPI: timing/mpi_construction.cpp
[x] .xyz file reader.
[x] multiple kernels/descriptors
[x] Pull branch #20
[x] Add likelihood and gradient evaluation
[ ] Add more docs
[x] docs for python api
[ ] Move some print statements behind the debug flag
[ ] The distmatrix needs to be public @anjohan
Lower priority:
Add update_matrix to support OTF training. In principle OTF training is already possible with the current ParallelSGP, but each update reconstructs the SGP from the whole data set instead of updating the matrices with the newly added data.
The distmatrix class needs a Python binding.
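The difference between rebuilding and updating can be illustrated with a serial toy example. The sketch assumes a fixed set of sparse points and a kernel matrix with one row per sparse point and one column per training point, so that new training data only adds columns. The kernel, the function names, and the matrix layout are all illustrative assumptions and do not describe how update_matrix would be implemented in ParallelSGP.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Matrix = std::vector<Vec>;  // row-major: k[i][j]

// Toy stand-in kernel (squared dot product), for illustration only.
double toy_kernel(const Vec &a, const Vec &b) {
  assert(a.size() == b.size());
  double dot = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) dot += a[i] * b[i];
  return dot * dot;
}

// Full rebuild: evaluates the kernel for every (sparse, training) pair.
Matrix build_kernel(const std::vector<Vec> &sparse,
                    const std::vector<Vec> &train) {
  Matrix k(sparse.size(), Vec(train.size()));
  for (std::size_t i = 0; i < sparse.size(); ++i)
    for (std::size_t j = 0; j < train.size(); ++j)
      k[i][j] = toy_kernel(sparse[i], train[j]);
  return k;
}

// Incremental update: evaluates the kernel only for the new training points
// and appends the results as extra columns of the existing matrix.
void append_columns(Matrix &k, const std::vector<Vec> &sparse,
                    const std::vector<Vec> &new_train) {
  assert(k.size() == sparse.size());
  for (std::size_t i = 0; i < sparse.size(); ++i)
    for (const Vec &x : new_train)
      k[i].push_back(toy_kernel(sparse[i], x));
}
```

Under these assumptions the incremental path costs work proportional to the new data only, while the rebuild repeats all earlier kernel evaluations. Adding new sparse points would additionally require new rows, which this sketch does not cover.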