Modifications to computeFSRSources()

meverson commented 10 years ago

Here's my modified version of computeFSRSources. It was originally intended to access only the nonzeros in the scattering matrix for each material but this proved to be too time consuming, oddly enough. This was the reason behind using index, n_mat and nnzs. It is now simplified to only calculate scattering below a cutoff representing the highest incoming group which is scattered. This cutoff is stored in nnzs as the code loops through all outgoing groups.

Also, four different index variables are introduced to reduce the total number of multiplications that have to be done within each loop to retrieve information for various quantities.

Let me know if you have any questions, or see any possible problems with this code. Thanks!

FP_PRECISION CPUSolver::computeFSRSourcesNew() {

int tid;
int flx_ind; 
int sig_ind;
int fg_ind;
int t_ind;
int n_mat;
int nnzs;

FP_PRECISION* nu_sigma_f;
FP_PRECISION* sigma_s;
FP_PRECISION* sigma_t;
FP_PRECISION* chi;
Material* material;

FP_PRECISION scatter_source;
FP_PRECISION fission_source;
FP_PRECISION temp;

FP_PRECISION source_residual = 0.0;
FP_PRECISION inv_k_eff = (1.0 / _k_eff);  // Precalculate inverse of k

std::vector< std::vector<int> > index;
index = _geometry->getEnergyBounds_fg();

FP_PRECISION* cell_res = new FP_PRECISION[_num_FSRs];
FP_PRECISION* group_res = new FP_PRECISION[_num_threads*_num_groups];

/* _scatter_sources was changed from [_num_FSRs*_num_groups] to 
   [_num_threads*_num_groups] since storing for each cell is unnecessary */

/* For all regions, find the source */
#pragma omp parallel for private(tid, flx_ind, sig_ind, fg_ind, \
  t_ind, n_mat, nnzs, material, nu_sigma_f, chi, sigma_s, sigma_t,\
  fission_source, scatter_source, temp) schedule(guided)
for (int r=0; r < _num_FSRs; r++) {

    tid = omp_get_thread_num();
    material = _FSR_materials[r];
nu_sigma_f = material->getNuSigmaF();
chi = material->getChi();
sigma_s = material->getSigmaS();
    sigma_t = material->getSigmaT();
    n_mat = material->getUid();

/* Precalculate indexes only dependendent on cell, thread or material */
flx_ind = r*_num_groups;    // Index to access current cell
fg_ind = n_mat*_num_groups; // Index to access material
t_ind = tid*_num_groups;    // Index to access threaded _scatter_sources and group_res

/* Compute fission source */
if (!(material->isFissionable())) {
    fission_source = 0.0;
}
else {
    for (int g=0; g < _num_groups; g++)
        _fission_sources[flx_ind + g] = _scalar_flux[flx_ind + g] * nu_sigma_f[g];

    fission_source = inv_k_eff * pairwise_sum<FP_PRECISION>(&_fission_sources[flx_ind], 
                        _num_groups);
}

/* Compute total scattering source for group g */
    for (int g=0; g < _num_groups; g++) {

    sig_ind = g*_num_groups; //Index for access cross section for outgoing group g

    nnzs = index[fg_ind+g].back()+1;
    for (int gp = 0; gp < nnzs; gp++) { 
        _scatter_sources[t_ind + gp] = sigma_s[sig_ind + gp]*_scalar_flux[flx_ind + gp];
    }

    scatter_source = pairwise_sum<FP_PRECISION>(&_scatter_sources[t_ind],
                        nnzs);

    /* Set the total source for region r in group G */
    _source[flx_ind + g] = (chi[g]*fission_source + scatter_source) * ONE_OVER_FOUR_PI;

    _reduced_source[flx_ind + g] = _source[flx_ind + g] / sigma_t[g];

    /* Compute the norm of residual of the source in the region, group */
    if ( fabs(_source[flx_ind + g]) > (7E-10/_num_groups) ){
        temp = (_source[flx_ind + g] - _old_source[flx_ind + g])/_source[flx_ind + g];
        group_res[t_ind + g] = temp*temp;
    }

    /* Update the old source */
    _old_source[flx_ind + g] = _source[flx_ind + g];

    }

/* Sum all residuals stored in group_res in current thread */
cell_res[r] = pairwise_sum<FP_PRECISION>(&group_res[t_ind],_num_groups);

}

/* Sum up the residuals from each group and in each region */
source_residual = pairwise_sum<FP_PRECISION>(cell_res,_num_FSRs);
source_residual = sqrt(source_residual / _num_FSRs);

delete [] cell_res;
delete [] group_res;

log_printf(DEBUG,"ALL SOURCES AND RESIDUALS BUILT");

return source_residual;

}

meverson commented 10 years ago

Also, I had to modify the _source cutoff when calculating the residual. When I switched to the 361 group problems I've been working on, many of the sources fell below this even though they contributed to the problem. Therefore, I divided the initial cutoff by the number of groups so this changes with the number of groups accordingly. Seems to have fixed the issue for me.

wbinventor commented 10 years ago

I incorporated some of these changes into the latest commit for CPUSolver and VectorizedSolver, which are the parents of ThreadPrivateSolver and VectorizedPrivateSolver. The GPUSolver class implements a slightly different version of computeFSRSources which leverages the thrust library reduction routine at the expense of a larger memory footprint.

I reduced the size of "_scatter_sources" to "_num_threads * _num_groups" from "_num_FSRs * _num_groups". In addition, I reduced the size of "_source_residuals" to "_num_FSRs" from "_num_FSRs * _num_groups".

As for the the index arithmetic operations, I'm counting on the compiler to recognize this and optimize it for me so I don't have to clutter the code with pre-computing indices.

commit (master): 95a6efebae202f452d0e54df06cbd576901acc1e

mit-crpg / OpenMOC

Modifications to computeFSRSources() #38