distribution dist_5d units and tools

rui-coelho commented 9 months ago

I think there is something fishy with the dict_5d units and tools to integrate/plot.

d = a5.data.active.getdist("5d") In [44]: d.distribution().units Out[44]: particles*s/(degree*e*kg**2*m**4)

I may understand the degree since it stands for the phi coordinate (let alone jacobian issues this poses when then integrating over (R,Z) since i suspect integrating over Phi for you is just multiply by 2Pi if the distribution is axisymmetric...).
But i don't understand what the electron charge e is doing there. At most, since you are using the particle momentum (and not the velocity) for the velocity phase space, i could expect an atomic number but the effective mass should be already included in the value of the distribution so even this is a bit weird.....
When we integrate the d.distribution() over the PHI, PPAR, PPER, CHARGE or over only PHI, PPAR, PPER we get the same result which doesn't make sense unless the charge is actually the charge number (Z=1 in my case) but again i don't get it why it is part of the units of the distribution in the first place...
The units are very much different from what ASCOT4 uses since when i grab a HDF5 output file from a run i was given in the past by Pietro/Matteo i see that the units are s^2/m^5 which makes much more sense to me. They are the units of the 6d distribution BUT already multiplied by the Vperp units resulting from the integration to go from 3d velocity to Vpar and Vperp. But the units in ASCOT5, using the same logic, if i would use P instead of V should be, unless i'm wrong, s^2/(Kg^2*m^5).
Using the very same post processing script to analyse the HDF5 (using h5py and not the toolset you developed), incidentally, the distribution function over (R,Z) and over (Vpar,Vperp) comes at values which are of the order (electron charge / proton mass) higher that the values i obtained from the old ASCOT4 data.

So....enlighten me if i'm clearly missing something.... )-:

miekkasarki commented 9 months ago

Ok, let's start!

The units are those because

ASCOT5 doesn't actually calculate the distribution function but it collects histograms (and these are in units of "particles").
To convert a histogram to a distribution (function) we divide the histograms by the (phase-space) volume of each bin.

And now to your points:

As you correctly deduced, 1/deg, is there for this very reason that the toroidal angle is one of the abscissae.
Charge and time are also abscissae, so we get units 1/e and 1/s in the distribution.
Note that e is in fact an unit here, not a constant, and the charge abscissa are also given in units of e. So if you have only a single bin in charge abscissa, and the bin width is 1 e, then the only difference in those two integrations should be that we loose 1/e from the units.
If you take a fresh distribution with getdist, and integrate it over phi, charge, and time, you should get s^2/(Kg^2*m^5), right?
Hmm shouldn't the difference be proton mass^2? To be honest, I've long forgotten what exactly ASCOT4 stores but I don't see why there should be an elementary charge or where the the other proton mass disappeared.

rui-coelho commented 9 months ago

In my script using h5py what i assume is the distribution is the ordinate field inside the results object, either from ASCOT4 or ASCOT5.....However i have a "slight issue": using h5py i don't have access to the phase-space volume since the keys are rather limited i.e.

In [20]: f['results']['run_1468060384']['dist5d'].keys()
Out[20]: <KeysViewHDF5 ['abscissa_nbin_01', 'abscissa_nbin_02', 'abscissa_nbin_03', 'abscissa_nbin_04', 'abscissa_nbin_05', 'abscissa_nbin_06', 'abscissa_nbin_07', 'abscissa_ndim', 'abscissa_vec_01', 'abscissa_vec_02', 'abscissa_vec_03', 'abscissa_vec_04', 'abscissa_vec_05', 'abscissa_vec_06', 'abscissa_vec_07', 'ordinate', 'ordinate_ndim']>

...is this "phase-space volume" a method of the object and thus not a dataset stored in the actual HDF5 ? In my run i get that the phasespacevolume of all bins is the same i.e. 2.96742084e-46......i always thought it was the "density" of markers which should reasonable match on all bins (?)

Integrating a 5d distribution from getdist over phi, charge, and time i get

In [11]: dist4d.distribution().units
Out[11]: particles*s**2/(kg**2*m**4)
In [12]: dist4d.histogram().units
Out[12]: particles
In [22]: dist4d.phasespacevolume().units
Out[22]: kg**2*m**4/s**2

So...it is a bit troublesome that i don't see the m**5. The fact that the phasespacevolume comes exactly in the inverse units only tells me that...OK...the distribution multiplied by the phasespace volume has units of particles....as it should....

Now for the actual results of the plots of the ordinate of the dist5d structure stored inside the HDF5. To get the (R,Z) and (Vpar,Vperp) distributions i just do

  # ACHTUNG: ordi comes in shape (nR,nZ,nvpa,nvpe) !!!!
  for i in range(0,nvpa):
    for j in range(0,nvpe):
      fVparVper[i,j] += np.sum(np.multiply(np.sum(ordi[:,:,i,j],1),2*np.pi*R))*dR*dz
  for i in range(0,nR):
    for j in range(0,nz):
      fRz[i,j] += np.sum(np.sum(ordi[i,j,:,:],0))*dvpa*dvpe

I attach the results of the ASCOT4 and the ASCOT5 for the RZ distribution. no way i can squeeze the Deuterium mass squared to match the two in magnitude, let alone the qualitative differences. ASCOT_Fdist_RZ.pdf ASCOT5_Fdist_RZ_200k_pietro_like.pdf

Also, the total number of particles is roughly 1/4 using ASCOT5 when compared to using ASCOT4. I assume i used as close as possible the same kinetic profiles and equilibrium (same eqdsk i assume). The BBNBI5 results "reasonably" match to the BBNBI ones if i divide the divergence by sqrt(2) as i wrote in another thread. Below the values i find first using ASCOT5 and then below for ASCOT4

From the integration of the dist5d:

In [32]: dist4d=d.integrate(time=np.s_[:], phi=np.s_[:], charge=np.s_[:],copy=True)

In [33]: dist4d.integrate(r=np.s_[:], z=np.s_[:], ppar=np.s_[:], pperp=np.s_[:],copy=True).histogram()
Out[33]: unyt_quantity(2.6896424e+19, 'particles')

similar value with

In [54]: np.sum(np.multiply(a5.data.active.getstate('weight',mode='gc',state='end'),a5.data.active.getstate('time',mode='gc',state='end')))
Out[54]: unyt_quantity(2.68994649e+19, '(dimensionless)')

From the end state of ASCOT4 HDF5:

169 Number of particles = 1.1046668274248273e+20

miekkasarki commented 9 months ago

Yes the phase space volume is calculated from the abscissae (since it is just outer producet of all abscissae vectors) in post-processing. All elements have the same volume since we have dull rectilinear coordinates.

Ok so dist4d needs to be divided with the toroidal volume (2pi R) to get the units right.

How dvpa and dvpe in this "ACHTUNG script" are calculated? Do they include units?

rui-coelho commented 9 months ago

  #To get the abcissae in Vparallel and Vperp i need to use a mass
  #so i either take the average or a the first one in the array...
  m=np.mean(m).reshape((1))
  amu=1.66053907e-27
  m*=amu

  R_edges = rzVDist['abscissa_vec_01'][:]
  phi_edges = rzVDist['abscissa_vec_02'][:]
  z_edges = rzVDist['abscissa_vec_03'][:]
  ppa_edges = rzVDist['abscissa_vec_04'][:]
  ppe_edges = rzVDist['abscissa_vec_05'][:]
  time_edges = rzVDist['abscissa_vec_06'][:]
  charge_edges = rzVDist['abscissa_vec_07'][:]

  dR = R_edges[1] - R_edges[0]
  dz = z_edges[1] - z_edges[0]
  dppa = ppa_edges[1] - ppa_edges[0]
  dppe = ppe_edges[1] - ppe_edges[0]

  R = R_edges[0:-1] + dR/2
  z = z_edges[0:-1] + dz/2
  vpa = (ppa_edges[0:-1] + dppa/2)/m
  vpe = (ppe_edges[0:-1] + dppe/2)/m

  ordi = rzVDist['ordinate']

  ASCOT={'R':R,'z':z,'vpa':vpa,'vpe':vpe,
         'dR':dR,'dz':dz,'dvpa':dppa/m,'dvpe':dppe/m,
         'nspecies':N_species,'mass':m,'Fdist':ordi}
  return ASCOT

rui-coelho commented 9 months ago

Note that i revised the other longer comment above since there was info missing......

rui-coelho commented 9 months ago

Yes the phase space volume is calculated from the abscissae (since it is just outer producet of all abscissae vectors) in post-processing. All elements have the same volume since we have dull rectilinear coordinates.

Ok so `dist4d needs to be divided with the toroidal volume (2pi R) to get the units right.

I'd say so yes if we want the traditional distribution function without any jacobian (except the factor from the change to Vpar and Vperp space). Then the method integrate when using the phi coordinate would multiply the distribution inside by 2pi*R before doing the actual integration so that we could raise the units by one m...

rui-coelho commented 9 months ago

For the record, on the total number of particles, it was my bad. I had assigned 1MW of power to the beam units but the PNBI setting files i read from actually contain two beam units, each unit with two groups, each group with 1MW......so it makes 4MW per "cluster" of units i read from file......similarly for the N-NBI cluster, the file contains the 2 units, each units with 5 group....so totals 10MW...

miekkasarki commented 9 months ago

Do you have an updated figure where this factor of 4 and the factor of 2 pi R are included? Those won't explain completly the difference between ASCOT4 and 5, but could help narrow it down.

rui-coelho commented 9 months ago

not yet since the GW is down for jobs. But it does make sense the different as a factor of 4 that i observed. Note that the method of counting from the endstate should not be biased with the 2piR since it comes from the markers themselves and the time span of each i.e.

np.sum(np.multiply(a5.data.active.getstate('weight',mode='gc',state='end'),a5.data.active.getstate('time',mode='gc',state='end')))

I mean, i assume it is correct. Note that the 2piR factor is surely included in your distribution (i hope) and this is why the units have m-4 and not m-5. Right now when you integrate over phi i guess that the only thing you do is multiply by 2Pi.

rui-coelho commented 9 months ago

I also tried to use the integration methods you devised for the objects to get the equivalent (R,Z) "distribution" and "oddly" enough i get some more reasonable order of magnitude as in ASCOT4 ASCOT5_Fdist_RZ_integrated_tools.pdf ASCOT5_Fdist_RZ_400k.pdf

Note that using h5py to read directly from the HDF5 i am not multiplying the "integrated" distribution ( integral of ordi = rzVDist['ordinate']) by the mass squared. So, i wonder what do you actually do in the integrate method.....

miekkasarki commented 9 months ago

The distribution functions are defined in a5py/ascot5io/dist.py and the exact mathematical operation is on this line: https://github.com/ascot4fusion/ascot5/blob/08d8cb700e8eda4d65bd564f0fb4f6107bbfa6d6/a5py/ascot5io/dist.py#L222

That plot looks good but its missing the 2piR factor. You can get the density (in 1/m**3) with built-in tools:

dist = a5.data.active.getdist("5d")
mom2d = a5.data.active.getdist_moments(dist, "density")
mom2d.ordinate("density", toravg=True);
mom2d.plot("density")

How does that compare to ASCOT4 plot?

rui-coelho commented 9 months ago

ASCOT_Fdist_RZ_moments_density_toravg.pdf

It clearly misses the 2piR factor. If you multiply it you get same order of magnitude as the one i did using the integrate method and then plot the "distribution" i.e.

dist = a5.data.active.getdist("5d")
dist_RZ=dist.integrate(phi=np.s_[:], ppar=np.s_[:], pperp=np.s_[:], charge=np.s_[:], time=np.s_[:],copy=True)

rui-coelho commented 9 months ago

The distribution functions are defined in a5py/ascot5io/dist.py and the exact mathematical operation is on this line:

https://github.com/ascot4fusion/ascot5/blob/08d8cb700e8eda4d65bd564f0fb4f6107bbfa6d6/a5py/ascot5io/dist.py#L222

Then i must be blind or asleep. You average the d.distribution() (which has units particles*s/(degree*e*kg**2*m**4)) with the "weights" being the size of each dimension cell so this means for the ppar and pperp this includes D mass squared. I'm perfectly fine with this integration... But...how does this distribution relates to what is actually written in the ordinate data in the HDF5 ?!

miekkasarki commented 9 months ago

Ok that plot seems pretty similar to ASCOT4 density, except for a factor of ~4. Or is this data still for the case with the wrong number of particles (or power)?

The field _distribution in the object is the ordinate from the HDF5 which is divided by the phase-space volume dr x dphi x dz x dppar x dpper x dcharge x dtime of each cell. In other words, if the data has not been manipulated in any way then the method dist.histogram() returns data that matches ordinate exactly.

Feeling enlightened already or even more confused? 😅

rui-coelho commented 9 months ago

Well, this might help. I just added up all the elements in the ordinate dataset of the HDF5 and added up all the elements in the d.distribution() that i read using your methods (mind the units are of course wrong since i only add terms and not integrate)....

In [24]: np.sum(dist.distribution())
Out[24]: unyt_quantity(4.84501131e+62, 'particles*s/(degree*e*kg**2*m**4)')

In [25]: np.sum(f['results']['run_2076535896']['dist5d']['ordinate'])
Out[25]: 2.683128242279483e+19

Now, incidentally.....the 2.683128242279483e+19 is precisely (sort of) what i get from the end state !

In [26]: np.sum(f['results']['run_2076535896']['endstate']['time'][:]*f['results']['run_2076535896']['endstate']['weight'][:])
Out[26]: 2.6833454892573434e+19

So....the ordinate being written on the HDF5 dataset doesn't make much sense since the simple SUM of all the array elements (7-d array, 3 spatial, 2 velocity, charge and time) should not have dimension of #particles !!!

miekkasarki commented 9 months ago

That's precisely what the ordinate is in ASCOT5! Remember, we keep the code as stupid as possible. When the distribution is updated during the simulation, the code adds weight * timestep into the cell that the particle is currently in. At the end of the simulation, that data is then just simply written into the file as is without any manipulation.

rui-coelho commented 9 months ago

what a stupid code.........so to reverse engineer i should grab the ordinate (which contains the number of particles at each cell in 7d phase space) and...........nooooo.........

i'll try that latter but this should be clearly written somewhere in the documentation.....a dataset using the same ordinate field should have the same units which is clearly not the case....

miekkasarki commented 9 months ago

The contents of the HDF5 are supposed to be for the developers´ eyes only, so the documentation on what exactly is stored there is in the source code and somewhere in here https://ascot4fusion.github.io/ascot5/_static/doxygen/index.html

rui-coelho commented 9 months ago

aia.....the HDF5 is the primordial, the cream and holy data repository. Of course i can use the methods of the a5py class but i need to have that python module accessible somehow....which might not always be the case.

ascot4fusion / ascot5

distribution dist_5d units and tools #55