MDIL-SNU / SIMPLE-NN

SIMPLE-NN(SNU Interatomic Machine-learning PotentiaL packagE – version Neural Network)
GNU General Public License v3.0
47 stars 23 forks source link

A bug in computing neighbors #92

Open pulkin opened 4 years ago

pulkin commented 4 years ago

I suspect there is a bug which misses some pairs. Following is the information I can share right now:

I cannot reproduce specific descriptors from structure 334 in the SiO2 example you provide. The index 334 is counting from zero. It corresponds to index 3340 (counting from zero) of the original chain of VASP structures in OUTCAR_comp. The mismatch is (here and further counting is from zero):

There may be other discrepancies which are below numerical threshold 1e-12.

I think I nailed down the problem with pair descriptors of Oxygen 5 and 28: there, expectedly, the pair O#5-O#28 is missing from sums. This is what I computed vs what you code gives (I was reading descriptors from individual pickle files your example saves at some point):

Oxygen #5 O-O descriptors
eta=0.0032 mine=8.174 yours=8.154 diff=2.002e-02
eta=0.0357 mine=5.556 yours=5.548 diff=7.678e-03
eta=0.0714 mine=3.785 yours=3.782 diff=2.678e-03
eta=0.1250 mine=2.255 yours=2.254 diff=5.515e-04
eta=0.2143 mine=1.048 yours=1.048 diff=3.962e-05
eta=0.3571 mine=0.349 yours=0.349 diff=5.861e-07
eta=0.7142 mine=0.028 yours=0.028 diff=1.560e-11

Assuming only a single term is missing it is easy to deduce that your implementation differs from my implementation by one Behler function with r=5.43115469 which is present in my case but absent in yours. This distance corresponds almost exactly to the distance between O#5-O#28 in neighboring supercells.

I do not know where exactly the problem is but I kindly ask to investigate this. Other 399 cells which I tested seem to show perfect agreement.

Nanco-L commented 4 years ago

Hi, pulkin. I don'n know the exact situation but your selection of cutoff distance r=5.43... maybe differ with the distance of input.yaml. The value of symmetry function is the summation of each neighbor term. Thus, different cutoff leads different value. The origin of missing pairs might be different cutoff distance I think. If the results are still different when you select same cutoff distance, please notice me or other developers.

pulkin commented 4 years ago

I use the provided SiO2 example with the cutoff distance 6 angstrom. I did not change anything there.

Nanco-L commented 4 years ago

Ok. I misunderstood your question. @JisuJung928, could you check the structure @pulkin said? I'll check the structure if possible.

JisuJung928 commented 4 years ago

Hi pulkin.

First, I checked our code calculates the symmetry functions as you mentioned. (eta=0.0032 -> 8.154...) I think your oxygen 5 indicates the 6th oxygen because I found the symmetry functions value in pickle['x']['O'][5] of index 3340 (counting from zero) in OUTCAR. So your oxygen 5 and 28 are 30th and 53rd atoms in the structure (Si: 24 + O: 48).

I modified our code to check the neighbor list thoroughly. However, my result is different from yours. 30th atom thinks 53rd atom as a neighbor, and their distance is 5.305323 Å.

Also, I calculate them by opening OUTCAR manually. According to OUTCAR_comp, the positions of 30th and 53rd oxygen are (0.987773, 5.53254, 10.3442) and (9.40916, 0.749697, 1.34771). If we consider the periodic boundary, their distance is the same as my above result. (5.30533 Å)

I think we discussed the same structure anyway because I found the same symmetry function value. Could you check the target atom or structure which your own code calculated?

pulkin commented 4 years ago

Yes, I will check it. What I can say right now is that there are several distances between these two atoms and r=5.43115469 corresponds to oxygens in neighboring cells, not to oxygens in the same cell.

JisuJung928 commented 4 years ago

Finally, I find what makes our results different. As you said, there are several distances between these two atoms. For example, our code calculated the distance between 30th (if it is the origin) and 53rd oxygen in [-1,0,1] cell, and it is r=5.305323. However, your code also considers the 53rd oxygen in [0,1,1] cell, and its distance is r=5.431155 which is omitted in ours. Thanks for your reporting. I will try to fix them soon. @Nanco-L

Nanco-L commented 4 years ago

Thank you, @JisuJung928 and @pulkin. We need to notice this bug to our members and need to fix it ASAP.

JisuJung928 commented 4 years ago

Hello pulkin. We have tried to figure out the reported error. However, it cannot be reproduced consistently. Now it calculated the right value as you mentioned. Could you try to compile simple-nn after adding extra_compile_args=["-O0"] in ffibuilder.set_source of simple_nn/features/symmetry_function/libsymf_builder.py? Sometimes optimization algorithm might cause an error.

pulkin commented 4 years ago

Unfortunately, I do not have enough time to debug this issue. If optimization options affect results beyond numerical precision then it is still a problem. I suggest you to set up CI and tests where you will be able to sort such things out in a more consistent manner.

JisuJung928 commented 3 years ago

Hello, pulkin.

It might come from the atomic simulation environment (ASE) module. Recently, we found that a bug from the ASE module when reading the coordination of atoms depending on the version. Current latest version resolves that problem. Please upgrade your SIMPLE-NN.