lppcom / opencl-book-samples

Automatically exported from code.google.com/p/opencl-book-samples
0 stars 0 forks source link

chapter 21, wrong array indexes #39

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Hello
I was looking at the book that you guys wrote.
In chapter 21, on matrix multiplication, I am afraid but the indexes for 
accessing the array data are wrong. Either you apply row major order or column 
major order. But the way you access does not make sense.
You have for instance A[Ndim][Pdim], accessing to element A[i][j] would be 
A[i*Pdim + j] for row major order and A[j*Ndim + i] for column major order. 
In the book chapter you write 

Given A[Ndim][Pdim], B[Pdim][Mdim],C[Ndim][Mdim]
tmp=0;
for (k=0; k<Pdim; k++) {
// C[i][j]+=A[i][k]*B[k][j]
tmp += *(A + (i*Ndim + k))* *(B + (k*Pdim + j));
}
*(C + (i*Ndim + j)) = tmp;

And it should be if following for instance row major order:
Given A[Ndim][Pdim], B[Pdim][Mdim],C[Ndim][Mdim]
tmp=0;
for (k=0; k<Pdim; k++) {
// C[i][j]+=A[i][k]*B[k][j]
tmp += *(A + (i*Pdim + k))* *(B + (k*Mdim + j));
}
*(C + (i*Mdim + j)) = tmp;

You can for instance do performance improvements by having B copied into local 
space but transposed, ie then it is row major order, so access to B is also 
continuous instead of jumping Mdim at each k.

On the matematical definition of the problem page 500 it reads 
Cij = Cij + sum(k=0 to P) Aik*Bkj
for 0<= i <= N and 0<= j <= M 
It should be 
Cij = sum(k=0 to P-1)Aik*Bkj
for 0<= i < N  and 0<= j < M

Let me know if I am missing anything.

Joshua

Original issue reported on code.google.com by joshua.m...@gmail.com on 12 Sep 2011 at 4:56