Laplace potential kernel - rinv evaluation

thomasgillis commented 6 years ago

Dear all,

I am starting with pvFMM and encountering an issue while trying to change the kernel evaluation. I tried to replace the evaluation of rinv = RSQRT_INTRIN by another intrinsic, provided, rsqrt_approx_intrin (which should be more accurate, up to my understanding).

I first try to run the program, without any modification (downloaded 5 march 2018), which goes through the example1 test smoothly.

./bin/exemple1 -N 4096

Maximum Absolute Error:2.47002e-05
Maximum Relative Error:4.5345e-09

Then, changing the evaluation of the rinv intrinsic in the kernel evaluation makes the algorithm blow up (after make clean everything).

./bin/exemple1 -N 4096

Maximum Absolute Error:3.07983e+09
Maximum Relative Error:565400

and runing the potential kernel only, gives me an error in the precomputation of the boundary conditions:

Cheb_Integ::Failed to converge.[6.93278e-09,-0.975,-0.975,-0.975]
Cheb_Integ::Failed to converge.[6.40075e-10,-0.975,-0.647222,-0.975]
Cheb_Integ::Failed to converge.[2.07315e-09,-0.975,-0.647222,-0.647222]
Cheb_Integ::Failed to converge.[2.08389e-09,-0.975,-0.647222,-0.319444]
...
Cheb_Integ::Failed to converge.[4.40053e-10,-0.975,0.00833333,0.663889]
Cheb_Integ::Failed to converge.[3.17001e-09,-0.975,0.00833333,0.991667]
Cheb_Integ::Failed to converge.[2.80505e-09,-0.975,0.00833333,1.31944]
Cheb_Integ::Failed to converge.[4.60546e-10,-0.975,0.00833333,1.64722]
Cheb_Integ::Failed to converge.[7.76104e-10,-0.975,0.336111,-0.975]
Cheb_Integ::Failed to converge.[8.40628e-11,-0.975,0.336111,-0.647222]
Cheb_Integ::Failed to converge.[1.85214e-09,-0.975,0.336111,-0.319444]
Cheb_Integ::Failed to converge.[4.40053e-10,-0.975,0.336111,0.00833333]

Here is the git diff of the downloaded code. I checked the kernel and the value only differs from each other by 1e-4 to 1e-6.

Do you have any solution to that issue?

diff --git a/include/kernel.txx b/include/kernel.txx
index 7867086..5abd5df 100755
--- a/include/kernel.txx
+++ b/include/kernel.txx
@@ -1088,7 +1088,7 @@ void laplace_poten_uKernel(Matrix<Real_t>& src_coord, Matrix<Real_t>& src_value,
   for(int i=0;i<NWTN_ITER;i++){
     nwtn_scal=2*nwtn_scal*nwtn_scal*nwtn_scal;
   }
-  const Real_t OOFP = 1.0/(4*nwtn_scal*const_pi<Real_t>());
+  const Real_t OOFP = 1.0/(4*const_pi<Real_t>());

   size_t src_cnt_=src_coord.Dim(1);
   size_t trg_cnt_=trg_coord.Dim(1);
@@ -1110,7 +1110,7 @@ void laplace_poten_uKernel(Matrix<Real_t>& src_coord, Matrix<Real_t>& src_value,
         r2=add_intrin(r2,mul_intrin(dy,dy));
         r2=add_intrin(r2,mul_intrin(dz,dz));

-        Vec_t rinv=RSQRT_INTRIN(r2);
+        Vec_t rinv=rsqrt_approx_intrin(r2);
         tv=add_intrin(tv,mul_intrin(rinv,sv));
       }
       Vec_t oofp=set_intrin<Vec_t,Real_t>(OOFP);

EDIT

When using the potential kernel, by removing the scale invariance boolean and the dbl_layer kernel, i managed to reduce the error:

@@ -1405,8 +1405,8 @@ void laplace_grad(T* r_src, int src_cnt, T* v_src, int dof, T* r_trg, int trg_cn
 template<class T> const Kernel<T>& LaplaceKernel<T>::potential(){
-  static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,1>, laplace_dbl_poten<T,1> >("laplace"     , 3, std::pair<int,int>(1,1),
-      NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL, &laplace_vol_poten<T>);
+  static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,1> >("laplace"     , 3, std::pair<int,int>(1,1),
+      NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL,NULL,false);
   return potn_ker;
 }
 template<class T> const Kernel<T>& LaplaceKernel<T>::gradient(){
@@ -1418,8 +1418,8 @@ template<class T> const Kernel<T>& LaplaceKernel<T>::gradient(){

 template<> inline const Kernel<double>& LaplaceKernel<double>::potential(){
   typedef double T;
-  static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,2>, laplace_dbl_poten<T,2> >("laplace"     , 3, std::pair<int,int>(1,1),
-      NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL, &laplace_vol_poten<double>);
+  static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,2> >("laplace"     , 3, std::pair<int,int>(1,1),
+      NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL,NULL,false);
   return potn_ker;
 }
 template<> inline const Kernel<double>& LaplaceKernel<double>::gradient(){

gives

Maximum Absolute Error:21.2006
Maximum Relative Error:0.0541196

PS: I am runing on mac OS and compiled with

./configure MPICXX=mpic++ CXX=icpc CC=icc F77=ifort CXXFLAGS="-mavx -g -std=c++11" CFLAGS="-mavx -g" FFLAGS="-mavx -g" --with-openmp-flag="qopenmp" --with-fftw-include="${FFTW_INC}" --with-fftw-lib="-mkl" --with-blas="-mkl" --with-lapack="-mkl" --disable-doxygen-doc --disable-doxygen-dot --disable-doxygen-html

where FFTW_INC refers to the the include dir of MKL/fftw

dmalhotra commented 6 years ago

rsqrt_approx_intrin is less accurate (_mm_rsqrt_ps).
You need to manually delete the precomputed files after making changes to the kernel function.

How these relate to your observations:

I believe in the first case, the errors are so large because you did not delete the precomputed files.
You get convergence warnings (not errors) during precomputation because the kernel function is less accurate. The warnings here are irrelevant since that part of the code is only used for periodic boundary conditions. The warning goes away if the volume-potential term in kernel construction is replace by NULL. The double-layer kernel and scale-invariance condition don't make any difference to the warnings.
In the second case you get much more reasonable relative errors (0.05 for a kernel function only accurate to 4-digits) because changing the scale-invariance condition required that the precomputed data had to be computed again for tree levels > 0. You could have achieved the same effect by deleting the precomputed files.

thomasgillis commented 6 years ago

Thank you very much for the detailed information. Helped a lot! Is it possible to disable the Precomp_XX writing using some flags?

dmalhotra commented 6 years ago

There is no option to disable writing the precomputed file, but you can hack it. You can use the configure option --with-precomp-dir="some-dir-path" to specify the directory where the file is stored. If you give a path which doesn't exist, then the file can not be written.

dmalhotra / pvfmm

Laplace potential kernel - rinv evaluation #7