CompPhysVienna / n2p2

n2p2 - A Neural Network Potential Package
https://compphysvienna.github.io/n2p2/
GNU General Public License v3.0

Training stops suddenly #171

Open yuki731 opened 2 years ago

yuki731 commented 2 years ago

Hello, I am training on structures in which hydrogen is adsorbed on a copper surface; my input.data combines several structures with different numbers of adsorbed hydrogen atoms. My problem is that training suddenly stops after several epochs. The data set sizes are 2000, 5000, and 10000 structures, and the problem is particularly severe for the 2000-structure set. Each structure contains 400-500 atoms. The problem does not arise with other data, such as copper-only structures or sets in which every structure has the same number of adsorbed hydrogen atoms. The error message is:

```
nnp-train: /sqfs/home/u6b593/eigen-3.4.0/Eigen/src/LU/PartialPivLU_LAPACKE.h:74: static int Eigen::internal::partial_lu_impl<double, StorageOrder, int, -1>::blocked_lu(long, long, double*, long, int*, int&, int) [with int StorageOrder = 0]: Assertion `info >= 0' failed.
```

The same error occurs with both Eigen 3.4.0 and Eigen 3.3.9. I tried the solution suggested for #91, which seemed to be a similar problem, but it did not fully resolve the issue. I also noticed that training survives more epochs if I remove the structures flagged with warnings in the nnp-train.log file (e.g. `WARNING: Structure 394 Atom 133 : 0 neighbors.`), but this did not solve the problem completely either.
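To find which structures to remove, I scan the log for those warnings. A minimal sketch, assuming the warning lines follow exactly the format quoted above (the function name and log-line pattern are my own, not part of n2p2):

```python
import re

# Pattern matching warning lines of the form quoted above:
#   WARNING: Structure 394 Atom 133 : 0 neighbors.
ZERO_NEIGHBOR_RE = re.compile(
    r"WARNING: Structure\s+(\d+)\s+Atom\s+(\d+)\s*:\s*0 neighbors"
)

def zero_neighbor_structures(log_lines):
    """Return the sorted structure indices that have at least one
    atom reported with zero neighbors (assumed log format)."""
    flagged = set()
    for line in log_lines:
        m = ZERO_NEIGHBOR_RE.search(line)
        if m:
            flagged.add(int(m.group(1)))
    return sorted(flagged)

# Example with the warning line quoted in this report:
sample = ["WARNING: Structure 394 Atom 133 : 0 neighbors."]
print(zero_neighbor_structures(sample))  # -> [394]
```

In practice I pass the open nnp-train.log to this function and then drop the listed structures from input.data before retraining.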

I would be grateful for any help in solving this problem.