sslattery / Chimera


ParaSails Preconditioner Analysis for the SPN Equations #30

Open sslattery opened 11 years ago

sslattery commented 11 years ago

ParaSails has been implemented with MCLS for preconditioning. ParaSails is nice for many reasons:

  1. It builds the inverse directly and that inverse has a sparsity pattern controlled by the input operator - giving a sparse composite operator.
  2. There are strong controls over the sparsity of the inverse.
  3. It has demonstrated good scaling when operating on convection-diffusion problems with hundreds of millions of unknowns.
  4. It gives a better preconditioning than Jacobi.
  5. It's pretty fast.
  6. It is a production implementation from Livermore.

I'm going to do a full analysis of the new preconditioner with the 2 Monte Carlo estimators and compare the performance to my other preconditioners.

This is my last preconditioner. Based on the literature, this performed the best for building sparse approximate inverses, and I can leverage a production implementation. If this doesn't work out, then more research is really required for preconditioning these stochastic problems, which is beyond the scope of this work.

sslattery commented 11 years ago

This one was pretty disappointing. The preconditioning was barely better than Jacobi, and as a result I couldn't get any MCSA variations to converge with either estimator. By the time you might be able to get this thing to converge, you have so many entries in the composite operator that you might as well use ILUT because it will work a lot better.

This is the last preconditioner I will be trying. The reality is that with the current way the iteration matrix is derived, we are required to build a composite operator that, as a result of a matrix-matrix multiplication, can have many entries per row. These preconditioners might work a lot better if we could lift that restriction and build the iteration matrix in a different way, such that we could sample the preconditioner and the operator independently without forming the composite.

Writing the Neumann series as it stands now destroys scalability and uses a lot of memory. Remember, if the 1-group SP1 equations have 7 entries per row and we have to store a composite operator with 2000 entries per row just to get convergence, then we are using a significant amount of resources, way beyond that of any subspace iteration. This problem has to be solved if there is any hope of using these methods for real problems. I am not sure, however, if solving that problem is within the scope of this work.
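To make the fill-in issue concrete, here is the form of the composite operator I have in mind. This is a sketch assuming the standard left-preconditioned fixed-point splitting; the exact construction in MCLS may differ in detail:

```latex
% Assumed left-preconditioned fixed-point splitting (MCLS details may differ):
x^{k+1} = x^{k} + M^{-1}\left(b - A x^{k}\right),
\qquad H = I - M^{-1} A,
\qquad x = \sum_{k \ge 0} H^{k} M^{-1} b .
% The Monte Carlo solver samples entries of H, so the composite M^{-1}A must be
% formed explicitly; its per-row fill is roughly the product of the two sparsity
% patterns, which is how a 7-entry/row SP1 operator can end up in a
% ~2000-entry/row composite.
```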

sslattery commented 11 years ago

Some new work has potentially made this feasible again. I'm opening this back up to do a little more analysis.

sslattery commented 11 years ago

Some good news here: with my bug fix, this is converging with the expected value estimator for the 1-group SP1 problem, with little memory overhead and better timing results for the domain generation. This approach will also likely scale better. I'm going to do another convergence analysis with the expected value estimator using the ParaSails preconditioner to study its properties.

sslattery commented 11 years ago

Here are the results of the analysis for the 1-group SP1 problem. In ParaSails there are 3 main parameters to tune: the number of levels, the threshold, and the filter. If the number of levels is N, then the sparsity pattern of the approximate inverse is the same as that of A^(N+1), where A is the matrix being preconditioned. The threshold is a tolerance used to remove small values from the input matrix, and the filter is a tolerance used to remove small values from the resulting approximate inverse.
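For reference, here is a minimal sketch of how these three knobs map onto hypre's C interface for ParaSails. This is not the MCLS wiring; the assembled matrix and vectors (A, b, x) are assumed to already exist:

```c
/* Minimal ParaSails setup sketch using hypre's C API (assumed usage;
 * the MCLS/Chimera wiring is not shown). */
#include <mpi.h>
#include "HYPRE_parcsr_ls.h"

void setup_parasails(MPI_Comm comm, HYPRE_ParCSRMatrix A,
                     HYPRE_ParVector b, HYPRE_ParVector x)
{
    HYPRE_Solver precond;
    HYPRE_ParaSailsCreate(comm, &precond);

    /* thresh drops small entries from A before the pattern is formed;
       nlevels = N gives the inverse the sparsity pattern of A^(N+1). */
    double thresh  = 0.01;  /* best value found below for 1-group SP1 */
    int    nlevels = 4;
    HYPRE_ParaSailsSetParams(precond, thresh, nlevels);

    /* filter drops small entries from the computed approximate inverse;
       it had little effect here, so leave it at 0. */
    HYPRE_ParaSailsSetFilter(precond, 0.0);

    /* 0 = treat the problem as nonsymmetric. */
    HYPRE_ParaSailsSetSym(precond, 0);

    HYPRE_ParaSailsSetup(precond, A, b, x);
}
```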

First, let's look at the effect of the number of levels in the preconditioner as a function of the number of histories per iteration, with the threshold and filter both fixed at 0.0:

[figure: threshold_0]

We see that even with the expected value estimator, this preconditioner results in a less robust method. As the number of levels is increased, we do see an improvement in convergence, getting down to a reasonable number of iterations even with 4 levels. Monotonicity is an issue here, and we do not always converge.

Next, fix the filter at 0.0 and raise the threshold to 0.01:

[figure: threshold_01]

This definitely speeds up the calculation while maintaining iterative performance. In addition, the use of the threshold permitted 6 levels to be computed in a reasonable amount of time. This could not be achieved with a threshold of 0.

Finally, the threshold was increased to 0.1 with the filter fixed at 0.0:

[figure: threshold_001]

Performance gets a little worse here and therefore a threshold of 0.01 is the best choice for this problem.

It was found that the filter had little effect on results and performance, so it should be left at 0. Overall, these preconditioners were generated much faster than ILUT, at the cost of being less robust. With ILUT, I had to invert the LU factors provided by Ifpack row by row. In addition, that type of preconditioner is sensitive to the parallel decomposition, with more iterations required to converge as more cores are used. I suspect that ParaSails will not have these issues. The resulting composite matrix is still denser than I would like, but far better than ILUT without any type of sparsity approximation applied.

I did notice a large amount of memory being consumed during generation of the Monte Carlo sampling tables (~10x the memory at run time). I'll have to refactor this in a way that uses less memory.

sslattery commented 11 years ago

I looked at the effects of the reduced domain approximation with ParaSails in #31. With that, I can now look at the effects of the relaxation parameters on convergence. Using 7 levels in the sparsity pattern and a threshold of 0.01, the Richardson relaxation was varied while the Neumann relaxation was fixed at 1.0. All other Neumann relaxation values performed worse. Here is the data with a weight recovery of 0 and a domain fill of 175:

[figure: relax]

This shows optimal iterative performance with a relaxation parameter around 1.9, much higher than for other methods. In fact, this method even converged, although poorly, with a relaxation parameter of 3. The optimum will likely change with the problem, but a value between 1 and 2 should generally work well. Compared to a relaxation of 1, this gave over a 30% reduction in the number of iterations, so simply defaulting to 1 may not be the best recommendation, although 1 will always converge if the problem is properly preconditioned.
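For reference, here is where I picture the two relaxation parameters entering. This is a sketch of a doubly-relaxed MCSA step in my own notation (omega_R for the Richardson relaxation, omega_N for the Neumann relaxation); the exact definitions used in MCLS may differ:

```latex
% Sketch of a doubly-relaxed MCSA step (notation assumed, not MCLS's exact form).
% Richardson half-step with relaxation \omega_R:
x^{k+1/2} = x^{k} + \omega_R\, M^{-1}\left(b - A x^{k}\right)
% Monte Carlo correction sampled from the Neumann series of the relaxed
% iteration matrix H = I - \omega_N\, M^{-1} A, applied to the residual
% r^{k+1/2} = M^{-1}\left(b - A x^{k+1/2}\right):
x^{k+1} = x^{k+1/2} + \delta x,
\qquad \delta x \approx \omega_N \sum_{m \ge 0} H^{m}\, r^{k+1/2}
```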