well, that's not normal, but something wrong might happen, e.g. if the update interval is too large or some other parameters have unreasonable values - Raga is not as foolproof as it should be. If you send me the input snapshot and parameter file, I could take a look.
Thank you for helping me on such short notice! I noticed that even using a Coulomb logarithm of zero spits out NaNs after a few timesteps. I am attaching the initial conditions and the initial parameter file. I am using lmax=2, symmetry=none, episodeLength=0.001. I start getting NaN values for the total energy during PotentialUpdate at t=1.588 with the Coulomb logarithm set to zero.
Archive.zip
Increasing numSamplesPerEpisode seems to increase the time before the code spits out a NaN.
ok, found some time to look into this. the immediate reason is that when the potential becomes steep enough near the centre (namely, with density growing faster than r^-2 towards the centre, the potential Phi(0) = -infinity), things begin to break down. The total energy is computed by summing up the particle energies and adding the potential energy of a central black hole particle (i.e. Mbh * Phi(0)), even if there is no BH - in this case 0 × (-infinity) gives NaN! After correcting this stupid oversight (raga_core.cpp, lines 26-27), the energy is finite, but the problem is deeper - the method is not expected to perform well in the case of centrally divergent potentials, at least not when relaxation is included.
The strange thing is that this gradual deepening of the central potential happens even if I completely turn off the relaxation, which should preserve the system in equilibrium, apart from unavoidable fluctuations caused by the finite particle number. But your snapshots have 128k particles, which should be more than enough. Will keep investigating this. Meanwhile, what is the initial model for this simulation?
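(Side note, for reference: the NaN here is just the standard floating-point rule that zero times infinity is undefined. A minimal Python illustration, with hypothetical variable names, not the actual Raga code:)

import math
Mbh  = 0.0                      # no central black hole particle
Phi0 = float('-inf')            # centrally divergent potential, Phi(0) = -infinity
print(Mbh * Phi0)               # prints nan
print(math.isnan(Mbh * Phi0))   # prints True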
Thank you so much for the insight! The initial model of this simulation was derived from Agama using a Spheroid density profile for both types of particles. For the heavier-mass particles, the inner density slope (gamma) was set to 1.5, whereas the outer density slope was set to 4. For the lighter particles, the inner slope was set to 0. All of the particles were generated assuming an isotropic velocity distribution. I am attaching the Agama Python file and the ini file that were used to generate the initial conditions.
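(For reference, since the attached script is not reproduced here: a minimal sketch of how such a two-component isotropic model might be set up with the Agama Python interface. The scale radii, density normalizations, sample sizes, output file name, and the outer slope of the lighter component are placeholder assumptions, not the values from the attached files.)

import agama, numpy

# heavier component: Spheroid with inner slope gamma=1.5 and outer slope beta=4
denHeavy = agama.Density(type='Spheroid', gamma=1.5, beta=4, scaleRadius=1.0)
# lighter component: cored profile (gamma=0); the outer slope here is an arbitrary choice
denLight = agama.Density(type='Spheroid', gamma=0.0, beta=4, scaleRadius=1.0)
# (normalizations are left at defaults; the actual script would set component masses
#  to produce the desired 0.1 particle-mass ratio)

# total potential generated by both components
pot = agama.Potential(type='Multipole', density=agama.Density(denHeavy, denLight))

# isotropic distribution functions in the total potential (Eddington inversion)
dfHeavy = agama.DistributionFunction(type='QuasiSpherical', density=denHeavy, potential=pot)
dfLight = agama.DistributionFunction(type='QuasiSpherical', density=denLight, potential=pot)

# sample particles from each component and write a combined snapshot
xvH, mH = agama.GalaxyModel(pot, dfHeavy).sample(100000)
xvL, mL = agama.GalaxyModel(pot, dfLight).sample(100000)
agama.writeSnapshot('initial_conditions.txt',
                    (numpy.vstack((xvH, xvL)), numpy.hstack((mH, mL))))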
ok thanks! I noticed that the snapshots produced by this script are scaled differently from the one you attached earlier, but this doesn't matter. By the way, it is not necessary to set the total mass of all particles to 1 and their total energy to -1/4 (see https://github.com/GalacticDynamics-Oxford/Agama/issues/12#issuecomment-932766333 ).
I was able to reproduce a similar behaviour - a gradual deepening of the central potential until the inner slope becomes steeper than -2 (which normally shouldn't happen in a self-gravitating system with relaxation: it always has a small core or a Bahcall-Wolf cusp around a central BH). The problem seems to be endemic and could be called "numerical relaxation" if you will - the recomputation of the potential is always subject to some discreteness noise from the finite number of particles, and for reasons that are not entirely clear to me, these fluctuations seem to accumulate in a particular direction (deepening the potential rather than randomly wobbling up and down). However, they can be greatly reduced by several countermeasures:
1) enforcing a more symmetric potential. In your case, the initial snapshot is spherical, but for Raga you set the potential expansion order lmax=2 and symmetry=none. If instead I dial it back to the spherical case (lmax=0), the fluctuations are strongly suppressed. Even with lmax=2 and symmetry=axisymmetric or triaxial, the result is still much better than with none. Of course, the killer feature of Raga is the ability to work with non-spherical systems, but I have always limited this to axisymmetric or triaxial cases.
2) using more "internal samples" stored from each particle's trajectory when recomputing the potential (numSamplesPerEpisode=10 or more), which proportionally reduces the fluctuations.
3) using longer update intervals (episodeLength) - the method really shines when this interval is much longer than the dynamical time but still a small fraction of the relaxation time. Even though there is no orbit-averaging involved (unlike in Hénon-type MC codes), most of the relevant processes occur on relaxation timescales, so it makes sense to update less frequently and thus reduce the impact of discreteness noise. The only exception I could name is when there is a binary SMBH at the centre, which may significantly affect the stellar distribution within just a few dynamical times at early stages of its evolution.
The most recent commit fixes the NaN bug and another hideous bug when using lmax=2, symmetry=triaxial, but otherwise the evolution is identical. Monitoring the total energy conservation and/or the central potential depth in the log file should give some indication of how sensible the results are. Energy conservation isn't great compared to traditional N-body codes, but at least it should not be dominated by a steady drift - more than a couple of percent relative error probably means that the results should not be trusted.
Thank you so much for the update and the detailed response! This really helped with my simulations. An additional question: should we expect an energy drift if we set a non-zero value of the Coulomb logarithm even with spherical symmetry? My eventual goal is to observe mass segregation in these systems, for which we would need a non-zero Coulomb logarithm.
I never thoroughly tested the code in the regime where energy relaxation is important. It may give sensible results, but chances are that the errors accumulate and render it unusable. First, why the energy fluctuations and drift appear at all: unlike Hénon-type codes, where particles do interact in a pairwise way and conserve total energy in each interaction, in Spitzer-type MC codes each particle gets velocity kicks independently of the others, drawn from a probability distribution constructed from the distribution function of the entire snapshot. Thus the total energy budget has 1/sqrt(N) random fluctuations because of relaxation. For large N (say, 10^5 to 10^6) this should be negligible; but in practice a more serious issue is the subtle asymmetry of positive and negative kicks, which eventually leads to an overall drift in energy rather than just fluctuations. I never got to the bottom of the problem, but there are several possible reasons, and none of them would be easy to fix.
In short, if you want to follow the stellar system through core collapse, Raga will almost certainly fail long before. The main niche for the code is when relaxation is rather weak (e.g. a nuclear star cluster with ~10^7 stars, represented by far fewer particles), but there are important physical phenomena on timescales much shorter than the relaxation time - primarily related to a central single or binary SMBH, i.e. loss-cone and three-body scattering effects. This is where it still shines after many years, because of the lack of comparably efficient alternatives. Although all applications so far have used single-mass systems, the multi-mass extension has been around for some time, and if you want to explore, e.g., how mass segregation plays with the extended loss cone in weakly collisional triaxial galactic nuclei, this would be a suitable usage scenario.
For purely spherical systems (multi-mass included), I'd say that a far more efficient and reliable alternative is the Fokker-Planck code PhaseFlow (also part of Agama), which has a practically unlimited dynamical range and excellent conservation properties, and is many orders of magnitude faster.
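(As a rough illustration of the 1/sqrt(N) scaling mentioned above - these numbers are just that estimate, not measured Raga errors:)

import math
for N in (1e5, 1e6):
    print(f"N = {N:.0e}: expected relative fluctuation ~ 1/sqrt(N) = {1/math.sqrt(N):.2%}")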
Thank you so much for such a detailed answer! This was extremely helpful. PhaseFlow is more suitable for my needs. I just ran a quick PhaseFlow simulation and it is certainly more efficient. I noticed that from the tabulated output, I can sample M(r) and f(E) to get an N-body snapshot.
cool!
yes, in principle you should be able to convert the output of PhaseFlow into an N-body snapshot. In the case of a single-component system, the easiest way is to run
mkspherical.exe density=table.txt out=snap.txt nbody=10000
where table.txt is the output table from phaseflow at some moment of time; mkspherical is built upon the same concept of spherical isotropic systems, and can actually write a very similar table file if you add a command-line argument tab=table1.txt. Although primarily designed to create spherical isotropic equilibrium models with a particular analytic density profile, it can read a text file with two columns - radius and enclosed mass - and turn it into a density profile; that's why the output table can also be used as input.
Things would be more complicated in multi-component systems: phaseflow writes each mass component into a separate file, but combining them is not an easy task (basically you want to construct an N-body snapshot out of a density model of a single component, embedded in the potential of all components - this is possible but requires some manipulations outside the command line, e.g. using the Python interface).
Looking at the Agama reference file and examples, I assume one way to sample from a multi-component system would be to create a density object from the cumulative mass profile and then create a composite potential from all of the components, which would then be passed to the SelfConsistentModel class. Would this method be reasonable?
actually, for spherical [an]isotropic models you don't need the iterative SelfConsistentModel approach. here is how I would generate an N-body representation of a two-component system, with density profiles taken from PhaseFlow output tables (comp1.txt, comp2.txt) - you may do the same with analytic density profiles, of course.
import agama, numpy
# read the radius and enclosed-mass columns of each PhaseFlow output table
den1 = agama.Density(cumulmass=numpy.loadtxt('comp1.txt')[:,0:2])
den2 = agama.Density(cumulmass=numpy.loadtxt('comp2.txt')[:,0:2])
# total potential generated by both density components
pot = agama.Potential(type='Multipole', density=agama.Density(den1, den2))
# isotropic distribution function of the first component in the combined potential
df1 = agama.DistributionFunction(type='QuasiSpherical', density=den1, potential=pot)
gm1 = agama.GalaxyModel(pot, df1)
# sample particles (positions/velocities and masses) and write the snapshot
xv, m = gm1.sample(10000)
agama.writeSnapshot('snap1.txt', (xv, m))
# the second component is sampled analogously, using den2 and its own DF
Thank you so much for the helpful comments!
Hi, I was performing a simulation involving ~131k particles. It is a two-component system in which one set of particles has masses 0.1 times those of the other set. I found that when I use a non-zero value for the Coulomb logarithm, the simulation starts reporting NaN values for the energy after a few timesteps. Is this behavior expected? If not, how should one go about troubleshooting it?
Thank you.