Multiplication Difference

kumargp commented 6 years ago

I am facing a strange problem that can't be reproduced by standalone test using the same data. For example, the following data gives -3.28023e-10 (in xgcm) instead of 0 (standalone).

1     0.502303        0        
29    0.0379605     0        
35    0.575368      0        
324   0       1.02025e-12      
325   0       1.49842e-10      
382   0       4.40099e-11      
763   0       1.88962e-11      
856   0       4.83222e-12      
857   0       7.23842e-13

cols= 1:column nums of matrix A(sparse); 2:values of matrixA; 3:values of matrixB (dense). All other values are 0's.

But this is not only for either values 0. Another one is:

 705   0.00348886    0    
 795   0.887906    0    
 796   0.0605864   5.62967e-09

All other have A (1st column) 0's. In the 2nd case the result is 8.39821e-09, instead of 3.41081e-10.

Interesting part is I get the values from sparse and dense matrices and do the multiplication in a loop and it gives right answer. But sparseDenseMatMult gives different result. The problem is I am not able to reproduce it using standalone script.

I checked my indices many times . (if you find any in this code that will be a success !). This problem is typical of uninitialized memory access, but I couldn't find any, since the data copied are done for a continuous range, and the result (of loop) matches with original loop multiplication using vectors of maps.


  double *fd_arr = new double[nvtx*npol];
  for(int ivtx=0; ivtx<nvtx; ++ivtx) 
  {
    int ioffset = ivtx*nrhop1*npol + irho*npol;
    for(int ipol=0; ipol<npol; ++ipol) 
    {
      fd_arr[ivtx*npol+ipol] = fd_linear[ioffset+ipol]; //fd_arr is filled continously
    }
  }

  las::Mat *mDens = las::getMatBuilder<las::dense>(0)->create(0, 0, 
      las::createDensity(nvtx, npol, fd_arr), MPI_COMM_SELF);

  auto *dOps = las::getLASOps<las::dense>();
  assert(mDens && dOps);
  las::MatMatMult *multiply = las::getSparseMatDenseMatMult();

  las::Mat *mRes = nullptr;
  multiply->exec(matA, mDens, &mRes);

  assert(nullptr != mRes);
  std::vector <int> rows(nvtx, 0);
  std::vector <int> columns(npol, 0);

  for(int i=0; i< nvtx; ++i)
    rows[i] = i;

  for(int j=0; j<npol; ++j)
     columns[j] = j;

  double *value = nullptr;
  dOps->get(mRes, nvtx, &rows[0], npol, &columns[0], &value);

  //check
    auto *sOps = las::getLASOps<las::sparse>();
    double *valA = nullptr;
    double *valB = nullptr;

    sOps->get(matA, nvtx, &rows[0], nvtx, &rows[0], &valA); 
    dOps->get(mDens, nvtx, &rows[0], npol, &columns[0], &valB);  

    for(int i=0; i<nvtx; ++i) //row of A = row of product
    { 
      for(int ipol=0; ipol<npol; ++ipol) //col of B
      {
        double res = 0;
        for(int j=0; j<nvtx; ++j) //col of A = row of B
        {
            double va = valA[i*nvtx+j];
            double vb = valB[j*npol+ipol];
            double p = va * vb;
            res += p;      
        }
           std::cout <<  res << "  :  " << value[i*npol+ipol]; << std::endl;
      }
    }

wrtobin commented 6 years ago

Without a standalone reproduction this is difficult for me to debug without access to the XCGM fork and branch that the issue is in. If you could provide me the with ability to check out a version of the XCGM code that includes these developments it will be easier to debug. That means either adding me to your personal fork or merging these developments into a branch on the main XCGM project.

kumargp commented 6 years ago

Since I spent lot of energy on matrix multiplication, I decided to dare to look into LAS implementation. This is what I did.

I added a memset in LAS dnsMat, since the exec is adding the initialized data. The + in the first of each loop is a problem.

(z)(xrw,ycl) += (x)(xrw,inr) (y)(inr,ycl);

class dnsMat { int rws; int cls; scalar vls; bool own; public: dnsMat(int rows, int cols, scalar vals = nullptr) : rws(rows) , cls(cols) , vls(vals) , own(false) { if(vls == nullptr) { vls = new scalar[rwscls]; std::memset(vls, 0, sizeof(scalar)rws*cols);

It seems to fix it, need more testing. Anyway I added you in my github, for future helps. I hope you will be able to access it. https://github.com/kumargp/xgc_scorec.git

On Wed, May 16, 2018 at 9:45 AM, William R Tobin notifications@github.com wrote:

Without a standalone reproduction this is difficult for me to debug without access to the XCGM fork and branch that the issue is in. If you could provide me the with ability to check out a version of the XCGM code that includes these developments it will be easier to debug. That means either adding me to your personal fork or merging these developments into a branch on the main XCGM project.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/tobinw/las/issues/28#issuecomment-389523933, or mute the thread https://github.com/notifications/unsubscribe-auth/APrdpgjybyfPUWHtknLTXL_r5w3arhc5ks5tzC2KgaJpZM4UAgUg .

wrtobin commented 6 years ago

Try not to respond through the email interface as it prevents us from using github markdown (apparently).

Further it is unfortunately preferable to call dense_ops->zero(Mat) to zero the matrix when necessary rather than always setting the values to 0.0 during construction, since that takes compute time. It is the same reason core Fields and Numberings are not initialized to zero, but can be zero'd directly after allocation if needed.

The allocation of a dense matrix is assumed to be large, so explicitly zeroing all that memory costs process cycles we don't want to spend unless it is necessary. In this case it is, but it should be done through a subsequent call to zero the mat as noted above.

I will add a note to the documentation to make this more explicit.

In the future if you have a fix, commit it to the dev branch (everyone should have push permission on that branch), push it, and submit a merge request for the master branch. Both of us independently implementing the same fix is a waste of development time.

kumargp commented 6 years ago

OK, thanks, I didn't know the problem of email response to github issue. In LAS I am just experimenting, don't want to push the changes, since, as you mentioned there may be other ways of doing it. I will use the Zero() method.

wrtobin / las

Multiplication Difference #28