nelsonag opened this issue 8 years ago
@paulromano is offering at least a $100 payout if you can come up with a direct method. Needs to work up to 64th order though (ENDF-102 limit).
A little more background reading (this was all mentioned in a comment on #494, but I thought I'd reproduce it here). There was apparently a code called SPATRAN which sampled Legendre expansions using a rejection method. They use a clever bounding function, described here. The bounding function can be improved further; see this paper.
Here's another relevant paper.
I think I already have a method (arbitrary order, high efficiency), but I'm not fully aware of the physics. Is it just pure Legendre or is it associated Legendre?
Just a regular Legendre expansion is sufficient.
I originally tried to derive a method that would be analytic in the sense that the Faddeeva function is analytic (infinite series that we just truncate once we hit machine precision), but I went for Newton's method.
The idea is that I form two matrices. The first converts Legendre coefficients into power-series coefficients, i.e. P(mu) = sum_i c_i mu^i. The second converts the same Legendre coefficients into the coefficients of the integral of P(mu) from 0 to s. These matrices are applied to the Legendre coefficients ahead of time and the results stored; I will call the resulting vectors icoeff and coeff. Both are of length order + 1. This preprocessing step is essential for speed.
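For concreteness, here is a minimal sketch of what that preprocessing could look like using numpy's built-in conversions (leg2poly and polyint are real numpy functions; the function name precompute and the assumption that the input is a plain Legendre-series coefficient vector are mine):

```python
import numpy as np
from numpy.polynomial import legendre as L
from numpy.polynomial import polynomial as P

def precompute(leg_coeffs):
    """Convert Legendre coefficients to power-series form once, up front.

    coeff[i] gives P(mu) = sum_i coeff[i] * mu**i, and icoeff holds the
    coefficients of its antiderivative (polyint's lower bound defaults
    to 0, so evaluating icoeff at s gives the integral from 0 to s).
    """
    coeff = L.leg2poly(np.asarray(leg_coeffs, dtype=float))
    icoeff = P.polyint(coeff)  # one entry longer: the degree rises by one
    return coeff, icoeff
```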
To perform Newton's method, we first solve the first two terms of the Legendre expansion to get an initial guess (a higher-order initial guess would help more), then iterate (in Python syntax):
import numpy as np

# One Newton step: xv holds powers of the current guess up to the
# degree of the integrated polynomial
xv = np.array([x_guess**i for i in range(len(icoeff))])
# Update guess: CDF residual over PDF value
move = (np.dot(xv, icoeff) - xi) / np.dot(xv[:len(coeff)], coeff)
x_guess -= move
I then bound the iterate between -1.0 and 1.0, and test whether abs(move) < 1.0e-15 to escape. Here are the results for the 8th-order Legendre terms of U238 at 150 keV for scattering. This run took 38 seconds on my computer in Python.
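Putting the pieces together, here is a minimal sketch of the whole sampler as I read the description above. The rescaling of xi onto the CDF measured from -1, the P0-only initial guess, and the name sample_mu are my assumptions; the actual script may differ:

```python
from numpy.polynomial import polynomial as P

def sample_mu(coeff, icoeff, xi, tol=1.0e-15, max_iter=50):
    """Newton iteration on the CDF.

    coeff/icoeff as from precompute() above; xi uniform on (0, 1).
    """
    # Map xi onto the CDF measured from mu = -1 (an assumption: the
    # thread compares against xi directly, so its xi may be pre-shifted)
    lo, hi = P.polyval(-1.0, icoeff), P.polyval(1.0, icoeff)
    target = lo + xi * (hi - lo)

    x_guess = 2.0 * xi - 1.0  # crude P0-only guess; the thread solves P0+P1
    for _ in range(max_iter):
        # Newton step: CDF residual over PDF value
        move = (P.polyval(x_guess, icoeff) - target) / P.polyval(x_guess, coeff)
        x_guess -= move
        x_guess = min(1.0, max(-1.0, x_guess))  # bound to [-1, 1]
        if abs(move) < tol:  # the escape test quoted above
            break
    return x_guess
```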
Not sure it's of any use. I imagine rejection sampling would be quicker when the distribution is close to isotropic, and Newton's method better when it is close to a forward-peaked delta function. I have the Python script that generated this graph, and a Mathematica script that generated the coefficients, if either is of any use.
EDIT: Also, it is worth mentioning that these matrices can be analytically extended to any order. So, if Newton's method does not converge fast enough, we can go to an arbitrarily high expansion. Given a 64th-order polynomial, an exact solution is guaranteed in 64 (or is it 65?) steps given exact arithmetic. However, the numerical stability will almost certainly fail us, and an iterative approach is still preferable.
I was hoping for something more elegant than root finding on the CDF, but maybe it's just not possible. I am interested to see though whether it would be more or less costly than rejection for typical cases we might see.
I originally thought it would be more elegant, but there are no closed-form solutions for order > 4. There are infinite-series solutions, which is what I initially started with, but they appeared to converge at a rate so slow as to make nearly any other method faster. I can try one more time at that, but I doubt the method would ever beat Newton's.
On the bright side, the above results usually converged to machine precision in 6 steps, each of which requires 9 evaluations of x^n, two dot products (29 multiplies, 28 sums), one divide, and one subtract. Thus, the entire operation could be done in around 500 flops. With perfect memory localization and vectorization, modern architectures could do this in around 125 cycles.
EDIT: I repeated the infinite-series formulation. It turns out I had my Mathematica script written wrong. For an 8-term Legendre kernel, a 12-term series gives 6 digits of accuracy over the entire domain. The method is more accurate as mu approaches zero. If I get time tomorrow, I'll write it up.
It has a few downsides. The first is that the matrix is different for each Legendre order. However, a higher-order matrix applied with the extra coefficients set to zero still gives the right answer, so we could get away with bands of matrices: a 15-stage matrix for all Legendre orders in [4, 10], a 30-stage matrix for orders in [11, 25], or something like that. The second downside is that the computational cost is proportional to matrix size, so it is a tradeoff between accuracy, speed, and gigantic matrices in the source code. The third downside is that it takes Mathematica an insane amount of time to derive the solution; to handle 64-term polynomials, it may take a few days.
However, it appears that this may beat Newton's method depending on how accurate you want to be.
EDIT2: It appears this method has some extreme numerical stability issues, and does not extend trivially to 20th order. I'll work on it passively.
Let me add another requirement: the method, whatever it looks like, should handle negative values of f(mu) 'gracefully' after reconstructing from the input Legendre coefficients.
Negative values are pretty easy to deal with no matter what algorithm is implemented. That aside, I have a few ideas on how to get away from Newton's method to something even faster. I'm working on them off and on. We'll see if they go anywhere.
I'm not sure what it means to handle negative values. Something really has to be done to the data ahead of time, e.g., NJOY assumes that portion of the PDF is zero. One could imagine other strategies like adjusting the expansion coefficients but preserving the moments of the PDF. Anyway, if there really were a direct method that could sample a Legendre expansion but couldn't handle a nonsensical PDF that is negative, that's fine by me.
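For reference, the NJOY-style fix-up mentioned above could look roughly like this sketch; the grid-based clipping and the renormalization step are my assumptions about the details, not NJOY's documented algorithm:

```python
import numpy as np
from numpy.polynomial import legendre as L

def fixup_pdf(leg_coeffs, n=1001):
    """Reconstruct f(mu) on a grid, zero out the negative excursions,
    and renormalize so the clipped PDF integrates to one again."""
    mu = np.linspace(-1.0, 1.0, n)
    f = L.legval(mu, leg_coeffs)
    f = np.where(f < 0.0, 0.0, f)
    return mu, f / np.trapz(f, mu)
```

Note that clipping and renormalizing changes the higher moments, which is why the coefficient-adjustment strategy mentioned above is the more careful alternative.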
The reason why I made that statement about being able to handle negatives is that the main driver right now for the multi-group mode is comparison with deterministic solvers. The deterministic solvers I'm aware of don't do anything internally about negative PDFs, they just absorb the negative fluxes and move on.
@nelsonag scary! How often do you encounter such a thing in deterministic calculations? As far as I'm aware, there are only a few random PDFs in ENDF/B-VII.1 which go negative.
@paulromano very frequently. I have a figure available from my research which shows the P5 PDFs for H-1 as a function of E_in, for scattering to the group (of a 47-group structure) that E_in falls in, if that makes sense. It's a little hard to see the details, but I think you can get a gist for how many negatives there are.
In time I can make a figure showing how much of a standard scattering matrix is representing a PDF with negatives as well.
I (wrongly) assumed that attaching an image to an email reply to the GitHub notification would produce the same result as including an image in the web interface. Now that I am at a computer I can see that's not the case. So here is the figure I mentioned above.
Now that I have a real keyboard too: this is the data output from NDPP for H-1, integrated over the HELIOS 47 groups. That is, this is a plot of f(mu, E_in -> g_out) reconstructed from P5 data. You should expect more negative regions as the order decreases to P1 (of course, P0 has no negative regions).
"OpenMG" (lets shorten that more: "OMG") allows the user to sample scattering angles from the provide legendre polynomials described the scattering moment matrices as a non-default option. This is currently done with the typical rejection sampling + rectangular bounding box with the height of the box being dictated by the maximum value of reconstructed f(mu). This is not highly efficient, and there is work out there to incorporate better bounding lines. This issue is about researching and implementing such work.
To get an A+ on this ticket, you should forego rejection sampling and instead come up with a way to invert the Legendre polynomials and sample from the reconstructed distribution directly (hey, you only need to do it up to the 10th order!).