Questions about NEP descriptors，Why is there an extra factor（2） in the derivation?

gengxingze commented 4 months ago

For two-body descriptors

$Gr = \sum{j} e^{-\eta(R_{ij} - R_s)^2 \cdot fc(R{ij})}$

Using the following two operations is equivalent 1.

            grad_desc[page + numnei * DIM + dim] -= pair;
            grad_desc[page + jj * DIM + dim] += pair;

2.

atom.fx[n1] += f12[0];
atom.fy[n1] += f12[1];
atom.fz[n1] += f12[2];
atom.fx[n2] -= f12[0];
atom.fy[n2] -= f12[1];
atom.fz[n2] -= f12[3];

For multi-body BP descriptors

$Ga = 2^{1-\xi} \sum{j } \sum{k } (1 + \lambda \cos \theta{ijk})^{\xi} \times e^{-\eta(R{ij}^2 + R{ik}^2 )} \times fc(R{ij}) \cdot fc(R{ik})$

The derivation process generally requires the derivation of rij, rik, and rjk, and then assigning the values with a code similar to the following, with ‘pair’ denoting the derivation

              grad_desc[page + numnei * DIM + dim] += -pair_ij - pair_ik;
              grad_desc[page + jj * DIM + dim] += pair_ij - pair_jk;
              grad_desc[page + kk * DIM + dim] += pair_ik + pair_jk;

But in the NEP multibody descriptor, as follows, the same three indicators i,j,k appear but are derived only for rij

$qi^{nl} = \sum{j \neq i} \sum_{k \neq i} gn(r{ij}) \cdot gn(r{ik}) \cdot Pl(\cos \theta{ijk}) $

The derivation in the GPUMD paper is as follows：

$\frac{\partial qi^{nl}}{\partial r{ij}} = 2 \sum_{k \neq i} \frac{\partial gn(r{ij})}{\partial r_{ij}} gn(r{ik}) \frac{r{ij}}{r{ik}} Pl(\cos \theta{ijk}) + 2 \sum_{k \neq i} gn(r{ij}) gn(r{ik}) \frac{\partial Pl(\cos \theta{ijk})}{\partial \cos \theta{ijk}} \times \frac{1}{r{ij}} \left( \frac{r{ik}}{r{ik}} - \frac{r{ij}}{r{ij}} \cos \theta_{ijk} \right) $

Why is there an extra factor of 2 here and no need to derive it for the ik and jk indicators?

brucefan1983 commented 4 months ago

The extra factor of 2 is from the chain rule of derivative (need to be very careful to get it right). In the formalism adopted in GPUMD, there is a quantity called the "partial force", which is $\partial U i / \partial \mathbf{r} {ij}$, and the force on atom $i$ is expressed as (Newton's third law is still valid for many-body potentials, which is not widely recognized)

$$\mathbf{F} i = \sum {j\neq i} \mathbf{F} _{ij}$$

$$\mathbf{F} _{ij} = \partial U i / \partial \mathbf{r} {ij} - \partial U j / \partial \mathbf{r} {ji}$$

An advantage of this formalism is to have clean expressions for virial stress and heat current. In CUDA programming, this formalism also enables to avoid the expensive and random-ordered atomicAdd() function.

The programming approach you gave for the Behler-Parrinello neural network potential is the "conventional" one, where the derivatives of the descriptor with respect to all the independent variables are calculated together. However, in the formalism used in GPUMD, only the "partial force" derivatives are first calculated.

brucefan1983 commented 4 months ago

Moreover, the expression for $\partial q i / \partial \mathbf{r} {ij}$ in my 2021 PRB paper for NEP is not the exact one used for programming. It involves double summations over neighbors, which is quite expensive for large cutoff radius. The actually implementation used spherical harmonics instead of Legendre polynomials (I have used Legendre polynomials in the early prototyping process and they gave equivalent results).

gengxingze commented 4 months ago

So for a BP descriptor such as :

$Gi=2^{1-ζ} ∑{(j,k≠i)} (1+λcosθ{ijk})^{ζ}e^{-η(R{ij}^2+R_{ik}^2 ) }*fc (R{ij} ) fc (R{ik})$ 如果令

$P(\cos\theta{ijk}) = 2^{1-\xi} (1+\lambda \cos\theta{ijk})$

$gn(r{ij})=e^{-ηR_{ij}^2}fc (R{ij})$

then

$G=∑{(j,k≠i)}P(\cos\theta{ijk})gn(r{ij})gn(r{ik})$

Similar to NEP, only ij is derived:

$\frac{\partial G}{\partial r{ij}} = \sum{k\neq i} \frac{\partial gn(r{ij})}{\partial r_{ij}} gn(r{ik}) \frac{r{ij}}{r{ij}} Pl(\cos\theta{ijk}) + \sum_{k\neq i} gn(r{ij}) gn(r{ik}) \frac{\partial Pl(\cos\theta{ijk})}{\partial \cos\theta{ijk}} \times \frac{1}{r{ij}} \left( \frac{r{ik}}{r{ik}} - \frac{r{ij}}{r{ij}} \cos\theta_{ijk} \right)$

Then carefully multiplying by the factor 2, the final force can be written as

$F_{ij} = \frac{\partial Ui}{\partial r{ij}} - \frac{\partial Uj}{\partial r{ji}}$

At this point we then do not need to derive the ik and jk indicators?

brucefan1983 commented 4 months ago

by the factor 2, the final force

Yes. Note that you need to first compute all the $\lbrace \partial U i / \partial \mathbf{r} {ij} \rbrace$, store them in memory and get $\partial U j / \partial \mathbf{r} {ji}$ when you need them. $\partial U j / \partial \mathbf{r} {ji}$ and $\partial U i / \partial \mathbf{r} {ij}$ just differ by an exchange of indices $i \leftrightarrow j$.

gengxingze commented 4 months ago

Thank you for your reply!

brucefan1983 / GPUMD

Questions about NEP descriptors，Why is there an extra factor（2） in the derivation? #623