Virtual casing de-nagged routine shows problems

jloizu commented 6 years ago

Sam,

You did replace the virtual casing routine D01EAF with DCUHRE on July 20th 2017:

https://github.com/PrincetonUniversity/SPEC/pull/5/commits/7f0194f1b7d8dd628569932ead0a246aabbef694

When I try running a free-boundary case (with mfreeits>0) I get screen output saying:

casing : 1275.28 : myid= 0 ; [x,y,z]=[ 3.34E+00 , 9.43E+00 , -1.58E+00 ]; gBn= -1.6503E-01 , err= 2.E+01 ; ifail= 1 ; min/max calls= 8 195 ; maxpts too smal;
casing : 1275.28 : myid= 0 ; [x,y,z]=[ 3.34E+00 , 9.43E+00 , -1.58E+00 ]; gBn= -1.6503E-01 , err= 2.E+01 ; ifail= 11 ; min/max calls= 195 390 ; NW is too small;

and the code still runs but the normal field on the computational boundary does not seem to converge while performing the Piccard iterations.

Any clue?

jloizu commented 6 years ago

I have created a branch verifcasing in order to address this issue by testing a very simple case of a "toroidal plasma wire". The purpose is to debug and verify the virtual casing calculation in SPEC.

See InputFiles/Verification/forcefree/wire for more information.

zhisong commented 6 years ago

The segmentation error appears because the work array rwk is corrupted in the integration subroutine. The problem can simply be solved by setting irestart=0. It seems that DCUHRE reuses some information from the work array if irestart==1. However, since the work array is completely reallocated, the attempt will fail and hence the error.

jloizu commented 6 years ago

What is strange is that, in DCUHRE documentation, it says (RESTAR=irestart)

If RESTAR = 1, then we restart a previous attempt. In this case the only parameters for DCUHRE that may be changed (with respect to the previous call of DCUHRE) are MINPTS, MAXPTS, EPSABS, EPSREL and RESTAR.

which means that we should not be reallocating (resizing) the working array rwk, but then DCUHRE gives the error "NW too small"...

lazersos commented 6 years ago

I think the initialization of MAXPTS is too small. If the code is set with NDIM=2 and KEY=0 then it uses 65 points per sub-interval. In the code the following is set: rr = 2**Ndim + 2 * Ndim**2 + 2 * Ndim + 1 maxcalls = max( mincalls, rr ) but this is WAY too small. In virtual_casing_mod.f90 in LIBSTELL we use the following deffinitions: IWRK = 16777216 wrklen = ((IWRK-ndim)/(2*ndim) + 1)*(2*ndim+2*nfun+2) + 17*nfun + 256 I think the confusion here is about the difference between the work array and MAXPTS . You can change MAXPTS between calls but you cannot change the size of the work array. Basically the work array is reserving space in memory. It's kinda of a strange way to deal with things. The size of the work array is fixed to admit a 'largest' value of MAXPTS. You then can run/restart the algorithm so long as the size of the work array does not change.

Based on the way casing.h is written then you can't restart since you change the size of the work array.

jloizu commented 6 years ago

Yes, but previously, when not changing the size of the working array, maxpts was increased when too small, and as this was done, DCUHRE complained in the next iteration by saying "NW too small"...but NW should not be changed!

jloizu commented 6 years ago

If I set maxpts = 3120 from the start, then DCUHRE does not complain. This number however depends on the vcasingtol parameter, which is already quite high (1e-2), so it is not clear to me yet how to deal with this in general.

jloizu commented 6 years ago

OK I think now I understood...

Basically one can change maxpts by some small amount and restart the call to DCUHRE with no problem, but if maxpts is, e.g., doubled, then DCUHRE cannot be restarted because the allocated working array is too small. So, since we have no idea of how large should maxpts be (since it depends, in particular, on the desired tolerance), we must allow for a possible substantial increase of maxpts. So, we need to resize/reallocate the working array (as I had implemented) but consider (as Zhisong suggested) that this is not a restart, i.e. setting irestart=0 (which means DCUHRE sort of starts from scratch with maxpts=2*maxpts).

This works. It is still to be seen if the bnormal obtained makes sense. I am going to check this now.

lazersos commented 6 years ago

@jloizu Yes. The size of the work array must always be set to be >= MAXPTS . If you set the work array size based on MAXPTS then you cannot restart because you cannot increase MAXPTS any further. You need to set the size of the work array to a much larger number. The limits are: MAXPTS>=3*NUM Where NUM in this case is 65. However, NW should be chosen to be much much greater than 3MAXPTS. Set MAXPTS=16777216 and calculate NW based on their formula and you should be fine for nearly arbitrary resolution: NUMFUN=1 NDIM=2 NUM=65 `MAXSUB = (MAXPTS-NUM)/(2NUM) + 1 NW>=MAXSUB(2NDIM+2NUMFUN+2) + 17NUMFUN + 1` However, if you hit MAXPTS evaluations you won't be able to restart because you'll have used up the space in your work array. Remember NW can be arbitrary large but MAXPTS cannot exceed a value specified by NW.

jloizu commented 6 years ago

We are almost there, I think...with the current version of the branch verifcasing, SPEC now runs without DCUHRE complaining regardless of the desired tolerance, and the resulting Bn harmonics compare much better to the semi-analytical formulas:

from SPEC, the Bns components are

m=0 0 m=1 0.004050826199529 m=2 0.002057986532810 m=3 0.000638640352965 m=4 0.000333154197053 m=5 0.000085197510155 m=6 0.000051720771111

from the theory, the Bns components are

m=0 0 m=1 0.038761636918969 m=2 0.019457777712520 m=3 0.006096243065616 m=4 0.003064264387890 m=5 0.000798721875985 m=6 0.000453231597973

namely a factor of about 10 between the two solutions.

zhisong commented 6 years ago

Stuart is right, we should change R(1) and Z(1) to the outer surface of the plasma volume. After doing so, the difference can be narrowed down to a factor of two.

zhisong commented 6 years ago

OK, problem solved. One should use the singular coordinate representation since it is the most inner volume. After applying this, SPEC and theory compares very well.

from spec m=0 0 m=1 3.8561223035750332E-002 m=2 1.9380836794440151E-002 m=3 6.0216706496999881E-003 m=4 3.0471242566654602E-003 m=5 7.8381026961210783E-004 m=6 4.5183946128742281E-004

from the theory, the Bns components are

m=0 0 m=1 0.038761636918969 m=2 0.019457777712520 m=3 0.006096243065616 m=4 0.003064264387890 m=5 0.000798721875985 m=6 0.000453231597973

If we further reduce the radius of the wire, SPEC is getting closer to the analytic result. The error messages popped by DCUHRE is both annoying and time wasting. I think we may implement what @lazersos is suggesting. @jloizu please check this, and see if we can conclude casing.h is doing the right thing. I have pushed the modified casing.h.

jloizu commented 6 years ago

Great! Good team work! I will check all this - I am currently traveling to germany and have reduced internet connection, but will be back in a couple of days.

jloizu commented 5 years ago

OK I checked that indeed now the virtual casing calculation of Bns is in agreement with the expected values from a toroidal wire in vacuum. Also, maxpts is now initialized large enough that warning signs from dcuhre should not appear (at least in well-behaved cases) and dcuhre will not waste time increasing maxpts several times.

I also retrieved the situation in which the possible current sheet at the plasma-vacuum interface is not omitted in the virtual casing integrand. In this case also the SPEC values for Bns agree well with the expected values.

I still want to check that the agreement gets better with decreasing the size of the plasma. And I am going to check if setting Lfindzero=2 (instead of 0) works (i.e. if there is convergence of the force-balance).

jloizu commented 5 years ago

You can find, in InputFiles/Verification/forcefree/wire/, a convergence scan of error versus plasma size.

I am closing now this issue. We can then discuss whether the virtual casing is to be considered verified or needs further testing.

PrincetonUniversity / SPEC

Virtual casing de-nagged routine shows problems #50