PrincetonUniversity / SPEC

The Stepped-Pressure Equilibrium Code, an advanced MRxMHD equilibrium solver.
https://princetonuniversity.github.io/SPEC/
GNU General Public License v3.0
24 stars 4 forks source link

Optimization of virtual casing method in SPEC #32

Open lazersos opened 6 years ago

lazersos commented 6 years ago

The virtual casing method in SPEC is far from optimal for the problem at hand. We should address this.

Background links

The integration of the equations of propagation of electric waves Use of the virtual-casing principle in calculating the containing magnetic field in toroidal plasma systems The virtual-casing principle for 3D toroidal systems A magnetic diagnostic code for 3D fusion equilibria The virtual-casing principle and Helmholtz's theorem

Background info from @SRHudson

It seems that there was some bug in replacing the previous NAG routine with DCUHRE, but this is not the main problem. If anyone wants to fix this bug (I started, and have committed/pushed my changes, but I need to go home now.)

In SPEC we need to calculate the magnetic field on a regular grid, Nt in theta and Nz in zeta, on the computational boundary. For each one of these Ntz = Nt Nz virtual casing integrals, we need to perform a surface integral over the plasma boundary. I think that the best algorithm in SPEC would be to compute the geometry and tangential field on the plasma boundary just once, and then to use this information for all of the virtual casing integrals required on the Nt Nz grid on the computational boundary. The point is that there is a lot of information that is common, so if we compute the common information only once then SPEC will be faster.

Presently I am using a slow Fourier transform to compute the magnetic field at whatever point that DCUHRE requires, and because DCUHRE is adaptive, the points on the surface where DCUHRE requires the geometry/tangential field of the plasma boundary will vary as the position on the computational boundary varies. One approach, which I first suggested, is to use/develop a quadrature algorithm that only requires the integrand on a fixed regular grid. This will probably be fasted, but we will potentially lose the accuracy that adaptive routines usually guarantee. Another approach is to first compute the geometry/tangential field on the plasma boundary on a regular grid and thereafter compute the geometry/tangential field using splines (or something fast). I am not sure that this will really solve the problem, because if we need such adaptivity it probably means we need to run SPEC at higher Fourier resolution in any case.

lazersos commented 6 years ago

In the STELLOPT virtual_casing_mod.f90 module used in DIAGNO, FIELDLINES, and BEAMS3D, we construct splines over the surface of the required quantities. This alleviates some of the overhead of evaluating the trigonometric functions but isn't necessarily optimal for this problem.

To calculate the field from the virtual casing we need three vector quantities. The field on the plasma vacuum interface (or rather the related surface current), the position where we want to evaluate the field, and the position of the interface. Now in SPEC the second quantity does not change during the simulation so this can be computed once on a finite grid. The position of the interface and the surface current change quite a bit.
Now the DCUHRE algorithm is nice in that it will evaluate the integral to a defined tolerance by refining the mesh of points over which it's sampling. This is good if for example the surface current is approaching the point at which we wish to evaluate the field. However, this can be problematic as well, as the whole processes can become computationally expensive.

A hack is to apply stellarator symmetry to reduce the number of points at which the field needs to be evaluated. As it turns out for stellarators we only need to evaluate the field on a grid from phi=[0,pi/NFP] and theta=[0,pi]. This is because of the mirror symmetry. Thus compared to a non-stellarator symmetric simulation the number of points at which the field needs to be evaluated is reduced by a factor of 4. BTW, VMEC does this when sampling the vacuum field.

However, this is just a 'trick.' Another method one could employ is to pre-calculate the Fourier terms into a matrix. If we always want to evaluate quantities on a known (u,v) mesh then evaluating the Fourier double integral is just a matrix multiply and only requires an initial evaluate of the trig coefficients, see this following code (from virtual_casing_mod.f90)

      SUBROUTINE uvtomn_local(k1,k,mnmax,nu,nv,xu,xv,fmn,xm,xn,f,signs,calc_trig)
      IMPLICIT NONE
      ! INPUT VARIABLES
      INTEGER, INTENT(in) :: k1
      INTEGER, INTENT(in) :: k
      INTEGER, INTENT(in) :: mnmax
      INTEGER, INTENT(in) :: nu
      INTEGER, INTENT(in) :: nv
      DOUBLE PRECISION, INTENT(in) :: xu(1:nu)
      DOUBLE PRECISION, INTENT(in) :: xv(1:nv)
      DOUBLE PRECISION, INTENT(inout) :: fmn(1:mnmax,k1:k)
      INTEGER, INTENT(in) :: xm(1:mnmax)
      INTEGER, INTENT(in) :: xn(1:mnmax)
      DOUBLE PRECISION, INTENT(in) :: f(1:nu,1:nv,k1:k)
      INTEGER, INTENT(in) :: signs
      INTEGER, INTENT(in) :: calc_trig
      ! LOCAL VARIABLES
      INTEGER     :: mn, i, j, ier, ik
      DOUBLE PRECISION :: xn_temp(1:mnmax,1)
      DOUBLE PRECISION :: xm_temp(1:mnmax,1)
      DOUBLE PRECISION :: pi2_l
      DOUBLE PRECISION :: mt(1:mnmax,1:nu)
      DOUBLE PRECISION :: nz(1:mnmax,1:nv)
      DOUBLE PRECISION :: xu_temp(1,1:nu)
      DOUBLE PRECISION :: xv_temp(1,1:nv)
      DOUBLE PRECISION, ALLOCATABLE, SAVE :: fnuv(:)
      DOUBLE PRECISION, ALLOCATABLE, SAVE :: cosmt(:,:)
      DOUBLE PRECISION, ALLOCATABLE, SAVE :: sinmt(:,:)
      DOUBLE PRECISION, ALLOCATABLE, SAVE :: cosnz(:,:)
      DOUBLE PRECISION, ALLOCATABLE, SAVE :: sinnz(:,:)
      ! BEGIN SUBROUTINE
      pi2_l = 8.0D+0 * ATAN(1.)
      IF (calc_trig == 1) THEN
         IF (ALLOCATED(cosmt)) DEALLOCATE(cosmt)
         IF (ALLOCATED(sinmt)) DEALLOCATE(sinmt)
         IF (ALLOCATED(cosnz)) DEALLOCATE(cosnz)
         IF (ALLOCATED(sinnz)) DEALLOCATE(sinnz)
         IF (ALLOCATED(fnuv)) DEALLOCATE(fnuv)
         ALLOCATE(fnuv(1:mnmax),STAT=ier)
         ALLOCATE(cosmt(1:mnmax,1:nu),sinmt(1:mnmax,1:nu),&
                  cosnz(1:mnmax,1:nv),sinnz(1:mnmax,1:nv),STAT=ier)
         FORALL(i=1:mnmax) xm_temp(i,1)=DBLE(xm(i))
         FORALL(i=1:mnmax) xn_temp(i,1)=DBLE(xn(i))
         FORALL(i=1:mnmax) fnuv(i) = 2.0D+0/DBLE(nu*nv)
         WHERE(xm == 0) fnuv = 5.0D-1*fnuv
         FORALL(i=1:nu) xu_temp(1,i)=xu(i)
         FORALL(i=1:nv) xv_temp(1,i)=xv(i)
         mt = MATMUL(xm_temp,xu_temp)
         nz = MATMUL(xn_temp,xv_temp)
         FORALL(mn=1:mnmax,i=1:nu) cosmt(mn,i) = dcos(pi2_l*mt(mn,i))
         FORALL(mn=1:mnmax,i=1:nu) sinmt(mn,i) = dsin(pi2_l*mt(mn,i))
         FORALL(mn=1:mnmax,i=1:nv) cosnz(mn,i) = dcos(pi2_l*nz(mn,i))
         FORALL(mn=1:mnmax,i=1:nv) sinnz(mn,i) = dsin(pi2_l*nz(mn,i))
      END IF
      fmn = zero
      IF (signs == 0) THEN
         DO mn = 1, mnmax
            DO i = 1, nu
               DO j = 1, nv
                  fmn(mn,k1:k) = fmn(mn,k1:k) + f(i,j,k1:k)*(cosmt(mn,i)*cosnz(mn,j) &
                  - sinmt(mn,i)*sinnz(mn,j))*fnuv(mn)
               END DO
            END DO
         END DO
      ELSE IF (signs == 1) THEN
         DO mn = 1, mnmax
            DO i = 1, nu
               DO j = 1, nv
                  fmn(mn,k1:k) = fmn(mn,k1:k) + f(i,j,k1:k)*(sinmt(mn,i)*cosnz(mn,j) &
                  + cosmt(mn,i)*sinnz(mn,j))*fnuv(mn)
               END DO
            END DO
         END DO
      END IF
      ! END SUBROUTINE
      END SUBROUTINE uvtomn_local

If we get clever about the grid spacing then it may be possible to just to integrate over fixed grids removing the overhead of DCUHRE.

lazersos commented 6 years ago

@SRHudson Regarding your debugging of DCUHRE here's the code used by vitual_casing_mod.f90

      SUBROUTINE bfield_virtual_casing_adapt_dbl(x,y,z,bx,by,bz,istat)
      IMPLICIT NONE
      ! INPUT VARIABLES
      DOUBLE PRECISION, INTENT(in)  :: x, y, z
      DOUBLE PRECISION, INTENT(out) :: bx, by, bz
      INTEGER, INTENT(inout) :: istat
      ! LOCAL VARIABLES
      LOGICAL            :: adapt_rerun
      INTEGER(KIND=8), PARAMETER :: ndim_nag = 2 ! theta,zeta
      INTEGER(KIND=8), PARAMETER :: nfun_nag = 3 ! Bx, By, Bz
      INTEGER(KIND=8), PARAMETER :: lenwrk_nag = IWRK
      INTEGER(KIND=8) :: maxcls_nag,mincls_nag, subs, restar, wrklen, rulcls, wrksbs, n, m, funcls
      DOUBLE PRECISION :: absreq_nag, relreq_nag
      DOUBLE PRECISION :: wrkstr_nag(lenwrk_nag)
      DOUBLE PRECISION :: a_nag(ndim_nag), b_nag(ndim_nag), &
                          finest_nag(nfun_nag), absest_nag(nfun_nag)
      DOUBLE PRECISION, ALLOCATABLE :: vrtwrk(:)

#ifdef NAG
      EXTERNAL :: D01EAF

#else
      EXTERNAL :: dcuhre

#endif

      ! BEGIN SUBROUTINE
      IF (adapt_tol < 0) THEN
         CALL bfield_virtual_casing_dbl(x,y,z,bx,by,bz)
         RETURN
      END IF

      a_nag(1:2) = zero
      b_nag(1:2) = one
      mincls_nag = MIN_CLS
      maxcls_nag = IWRK
      !absreq_nag = zero
      absreq_nag = adapt_tol       ! Talk to Stuart about proper values
      relreq_nag = adapt_rel ! Talk to Stuart about proper values
      finest_nag = zero
      absest_nag = zero
      x_nag      = x
      y_nag      = y
      z_nag      = z
      adapt_rerun = .true.
      subs = 1
      restar = 0
      DO WHILE (adapt_rerun) 

#ifdef NAG
         CALL D01EAF(ndim_nag,a_nag,b_nag,mincls_nag,maxcls_nag,nfun_nag,funsub_nag_b,absreq_nag,&
                   relreq_nag,lenwrk_nag,wrkstr_nag,finest_nag,absest_nag,istat)
         IF (istat == 1 .or. istat == 3) THEN
            maxcls_nag = maxcls_nag*10
            mincls_nag = -1
            restar = 1
            WRITE(6,*) '!!!!!  WARNING Could not reach desired tollerance  !!!!!'
            WRITE(6,*) '  BX = ',finest_nag(1),' +/- ',absest_nag(1)
            WRITE(6,*) '  BY = ',finest_nag(2),' +/- ',absest_nag(2)
            WRITE(6,*) '  BZ = ',finest_nag(3),' +/- ',absest_nag(3)
            WRITE(6,*) '!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
         ELSE IF (mincls_nag <= 32) THEN
            mincls_nag = 64
            restar = 1
         ELSE IF (istat < 0) THEN
            bx = zero
            by = zero
            bz = zero
            adapt_rerun=.false.
         ELSE
            bx = finest_nag(1)
            by = finest_nag(2)
            bz = finest_nag(3)
            adapt_rerun=.false.
         END IF

#else
         IF (.not.ALLOCATED(vrtwrk)) THEN
            wrklen = ((maxcls_nag-ndim_nag)/(2*ndim_nag) + 1)*(2*ndim_nag+2*nfun_nag+2) + 17*nfun_nag + 256
            ALLOCATE(vrtwrk(wrklen),STAT=istat)
            IF (istat .ne. 0) THEN
               WRITE(6,*) ' ALLOCATION ERROR IN: bfield_virtual_casing_adapt_dbl'
               WRITE(6,*) '   VARIABLE: VRTWRK, SIZE: ',wrklen
               WRITE(6,*) '   ISTAT: ',istat
               RETURN
            END IF
         END IF
         CALL dcuhre(ndim_nag,nfun_nag,a_nag,b_nag,mincls_nag,maxcls_nag,funsub_nag_b,absreq_nag,&
                     relreq_nag,0,wrklen,restar,finest_nag,absest_nag,funcls,istat,vrtwrk)
         IF (istat == 1) THEN
            maxcls_nag = maxcls_nag*10
            mincls_nag = funcls
            restar = 1
            istat = 0
         ELSE IF (istat > 1) THEN
            bx = zero
            by = zero
            bz = zero
            adapt_rerun=.false.
            DEALLOCATE(vrtwrk)
         ELSE
            bx = finest_nag(1)
            by = finest_nag(2)
            bz = finest_nag(3)
            adapt_rerun=.false.
            DEALLOCATE(vrtwrk)
         END IF

#endif
      END DO
      nlastcall=mincls_nag
      RETURN
      ! END SUBROUTINE
      END SUBROUTINE bfield_virtual_casing_adapt_dbl

      SUBROUTINE funsub_nag_b(ndim, vec, nfun, f)
      IMPLICIT NONE
      ! INPUT VARIABLES
      INTEGER, INTENT(in) :: ndim, nfun
      DOUBLE PRECISION, INTENT(in) :: vec(ndim)
      DOUBLE PRECISION, INTENT(out) :: f(nfun)
      ! LOCAL VARIABLES
      INTEGER :: ier
      DOUBLE PRECISION :: bn, bx, by, bz , xs, ys, zs, gf, gf3, nx, ny, nz, ax, ay, az
      ! BEGIN SUBROUTINE

      xs = zero; ys = zero; zs = zero
      CALL EZspline_interp(x_spl,vec(1),vec(2),xs,ier)
      CALL EZspline_interp(y_spl,vec(1),vec(2),ys,ier)
      CALL EZspline_interp(z_spl,vec(1),vec(2),zs,ier)
      CALL EZspline_interp(kx_spl,vec(1),vec(2),ax,ier)
      CALL EZspline_interp(ky_spl,vec(1),vec(2),ay,ier)
      CALL EZspline_interp(kz_spl,vec(1),vec(2),az,ier)

      bn = zero
      gf   = one/DSQRT((x_nag-xs)*(x_nag-xs)+(y_nag-ys)*(y_nag-ys)+(z_nag-zs)*(z_nag-zs))
      gf3  = gf*gf*gf
      f(1) = norm_nag*(ay*(z_nag-zs)-az*(y_nag-ys)+bn*(x_nag-xs))*gf3
      f(2) = norm_nag*(az*(x_nag-xs)-ax*(z_nag-zs)+bn*(y_nag-ys))*gf3
      f(3) = norm_nag*(ax*(y_nag-ys)-ay*(x_nag-xs)+bn*(z_nag-zs))*gf3
      RETURN
      END SUBROUTINE funsub_nag_b