dmalhotra / pvfmm

A parallel kernel-independent FMM library for particle and volume potentials
http://pvfmm.org
GNU Lesser General Public License v3.0
51 stars 28 forks source link

Laplace potential kernel - rinv evaluation #7

Closed thomasgillis closed 6 years ago

thomasgillis commented 6 years ago

Dear all,

I am starting with pvFMM and encountering an issue while trying to change the kernel evaluation. I tried to replace the evaluation of rinv = RSQRT_INTRIN by another intrinsic, provided, rsqrt_approx_intrin (which should be more accurate, up to my understanding).

I first try to run the program, without any modification (downloaded 5 march 2018), which goes through the example1 test smoothly.

./bin/exemple1 -N 4096

Maximum Absolute Error:2.47002e-05
Maximum Relative Error:4.5345e-09

Then, changing the evaluation of the rinv intrinsic in the kernel evaluation makes the algorithm blow up (after make clean everything).

./bin/exemple1 -N 4096

Maximum Absolute Error:3.07983e+09
Maximum Relative Error:565400

and runing the potential kernel only, gives me an error in the precomputation of the boundary conditions:

Cheb_Integ::Failed to converge.[6.93278e-09,-0.975,-0.975,-0.975]
Cheb_Integ::Failed to converge.[6.40075e-10,-0.975,-0.647222,-0.975]
Cheb_Integ::Failed to converge.[2.07315e-09,-0.975,-0.647222,-0.647222]
Cheb_Integ::Failed to converge.[2.08389e-09,-0.975,-0.647222,-0.319444]
...
Cheb_Integ::Failed to converge.[4.40053e-10,-0.975,0.00833333,0.663889]
Cheb_Integ::Failed to converge.[3.17001e-09,-0.975,0.00833333,0.991667]
Cheb_Integ::Failed to converge.[2.80505e-09,-0.975,0.00833333,1.31944]
Cheb_Integ::Failed to converge.[4.60546e-10,-0.975,0.00833333,1.64722]
Cheb_Integ::Failed to converge.[7.76104e-10,-0.975,0.336111,-0.975]
Cheb_Integ::Failed to converge.[8.40628e-11,-0.975,0.336111,-0.647222]
Cheb_Integ::Failed to converge.[1.85214e-09,-0.975,0.336111,-0.319444]
Cheb_Integ::Failed to converge.[4.40053e-10,-0.975,0.336111,0.00833333]

Here is the git diff of the downloaded code. I checked the kernel and the value only differs from each other by 1e-4 to 1e-6.

Do you have any solution to that issue?

diff --git a/include/kernel.txx b/include/kernel.txx
index 7867086..5abd5df 100755
--- a/include/kernel.txx
+++ b/include/kernel.txx
@@ -1088,7 +1088,7 @@ void laplace_poten_uKernel(Matrix<Real_t>& src_coord, Matrix<Real_t>& src_value,
   for(int i=0;i<NWTN_ITER;i++){
     nwtn_scal=2*nwtn_scal*nwtn_scal*nwtn_scal;
   }
-  const Real_t OOFP = 1.0/(4*nwtn_scal*const_pi<Real_t>());
+  const Real_t OOFP = 1.0/(4*const_pi<Real_t>());

   size_t src_cnt_=src_coord.Dim(1);
   size_t trg_cnt_=trg_coord.Dim(1);
@@ -1110,7 +1110,7 @@ void laplace_poten_uKernel(Matrix<Real_t>& src_coord, Matrix<Real_t>& src_value,
         r2=add_intrin(r2,mul_intrin(dy,dy));
         r2=add_intrin(r2,mul_intrin(dz,dz));

-        Vec_t rinv=RSQRT_INTRIN(r2);
+        Vec_t rinv=rsqrt_approx_intrin(r2);
         tv=add_intrin(tv,mul_intrin(rinv,sv));
       }
       Vec_t oofp=set_intrin<Vec_t,Real_t>(OOFP);

EDIT

When using the potential kernel, by removing the scale invariance boolean and the dbl_layer kernel, i managed to reduce the error:

@@ -1405,8 +1405,8 @@ void laplace_grad(T* r_src, int src_cnt, T* v_src, int dof, T* r_trg, int trg_cn
 template<class T> const Kernel<T>& LaplaceKernel<T>::potential(){
-  static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,1>, laplace_dbl_poten<T,1> >("laplace"     , 3, std::pair<int,int>(1,1),
-      NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL, &laplace_vol_poten<T>);
+  static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,1> >("laplace"     , 3, std::pair<int,int>(1,1),
+      NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL,NULL,false);
   return potn_ker;
 }
 template<class T> const Kernel<T>& LaplaceKernel<T>::gradient(){
@@ -1418,8 +1418,8 @@ template<class T> const Kernel<T>& LaplaceKernel<T>::gradient(){

 template<> inline const Kernel<double>& LaplaceKernel<double>::potential(){
   typedef double T;
-  static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,2>, laplace_dbl_poten<T,2> >("laplace"     , 3, std::pair<int,int>(1,1),
-      NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL, &laplace_vol_poten<double>);
+  static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,2> >("laplace"     , 3, std::pair<int,int>(1,1),
+      NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL,NULL,false);
   return potn_ker;
 }
 template<> inline const Kernel<double>& LaplaceKernel<double>::gradient(){

gives

Maximum Absolute Error:21.2006
Maximum Relative Error:0.0541196

PS: I am runing on mac OS and compiled with

./configure MPICXX=mpic++ CXX=icpc CC=icc F77=ifort CXXFLAGS="-mavx -g -std=c++11" CFLAGS="-mavx -g" FFLAGS="-mavx -g" --with-openmp-flag="qopenmp" --with-fftw-include="${FFTW_INC}" --with-fftw-lib="-mkl" --with-blas="-mkl" --with-lapack="-mkl" --disable-doxygen-doc --disable-doxygen-dot --disable-doxygen-html

where FFTW_INC refers to the the include dir of MKL/fftw

dmalhotra commented 6 years ago
  1. rsqrt_approx_intrin is less accurate (_mm_rsqrt_ps).
  2. You need to manually delete the precomputed files after making changes to the kernel function.

How these relate to your observations:

thomasgillis commented 6 years ago

Thank you very much for the detailed information. Helped a lot! Is it possible to disable the Precomp_XX writing using some flags?

dmalhotra commented 6 years ago

There is no option to disable writing the precomputed file, but you can hack it. You can use the configure option --with-precomp-dir="some-dir-path" to specify the directory where the file is stored. If you give a path which doesn't exist, then the file can not be written.