bgaster / opencl-book-samples

Automatically exported from code.google.com/p/opencl-book-samples

Matrix Multiplication (ch.21) #74

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
In chapter 21, the example is described as a "multiplication of A times B with the
result added into C" (p.499).
This operation is then written as C = C + A * B throughout the chapter.

However, it seems to me that the presented code is actually doing C = A * B.

In listing 21.1 (p.500-501), the product A*B is accumulated in tmp (which is
initialized to 0.0). tmp is then copied into C(i,j) without taking the initial
value of C(i,j) into account.
Also, in listing 21.3 (p.504-505), the C matrix is created as a
CL_MEM_WRITE_ONLY memory object, and its contents are never written to the
device before the kernel is launched.
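
To make the difference concrete, here is a minimal host-side sketch in plain C. It is not the book's code: the function names `mmul_overwrite` and `mmul_accumulate` are hypothetical, and square n x n row-major matrices are assumed so that only the final store differs.

```c
#include <assert.h>
#include <stddef.h>

/* C = A * B: the previous contents of C are discarded.
 * Hypothetical helper; assumes square n x n row-major matrices. */
void mmul_overwrite(size_t n, const float *A, const float *B, float *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float tmp = 0.0f;
            for (size_t k = 0; k < n; k++)
                tmp += A[i * n + k] * B[k * n + j];
            C[i * n + j] = tmp;      /* plain store: old C(i,j) is lost */
        }
    }
}

/* C = C + A * B: the product is added to the existing contents of C.
 * Hypothetical helper; same layout assumptions as above. */
void mmul_accumulate(size_t n, const float *A, const float *B, float *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float tmp = 0.0f;
            for (size_t k = 0; k < n; k++)
                tmp += A[i * n + k] * B[k * n + j];
            C[i * n + j] += tmp;     /* read-modify-write: old C(i,j) is kept */
        }
    }
}
```

The two functions differ only in the last statement of the inner body. On the OpenCL side, the accumulating form would also need the kernel to be able to read C, for example by creating the buffer with CL_MEM_READ_WRITE and uploading the initial values with clEnqueueWriteBuffer (or CL_MEM_COPY_HOST_PTR). That is a suggestion, not something taken from the book's listings.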

Original issue reported on code.google.com by thalie.k...@gmail.com on 10 Oct 2012 at 10:10

matejaputic commented 8 years ago

I second this.

The code in Listing 21.2 should read:

const char *C_elem_KernelSource = "\n"\
"__kernel void mmul(const int Mdim,               \n"\
"                   const int Ndim,               \n"\
"                   const int Pdim,               \n"\
"                   __global float* A,            \n"\
"                   __global float* B,            \n"\
"                   __global float* C)            \n"\
"{                                                \n"\
"    int k;                                       \n"\
"    int i = get_global_id(0);                    \n"\
"    int j = get_global_id(1);                    \n"\
"    i = i + global_pim_id_0*get_global_size(0);  \n"\
"    j = j + global_pim_id_1*get_global_size(1);  \n"\
"    float tmp;                                   \n"\
"    if ( (i<Ndim) && (j<Mdim)) {                 \n"\
"      tmp = 0.0;                                 \n"\
"      for (k = 0; k < Pdim; k++)                 \n"\
"          tmp += A[i*Ndim+k] *  B[k*Pdim+j];     \n"\
"      C[i*Mdim+j] = tmp;                         \n"\
"    }                                            \n"\
"}                                                \n"\
"\n";

The only change is that C[i*Ndim+j] becomes C[i*Mdim+j].
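
Why the stride matters can be checked on the host with a small sketch. It assumes row-major storage and that C has Ndim rows and Mdim columns, as the bounds check `(i<Ndim) && (j<Mdim)` suggests. The helper name `stride_is_valid` is hypothetical. It reports whether the offset `i*stride + j` gives every element of an Ndim x Mdim matrix its own in-bounds slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Returns 1 if offset = i*stride + j maps every (i, j) of an
 * Ndim x Mdim matrix to a distinct slot inside Ndim*Mdim elements,
 * and 0 otherwise. Hypothetical helper; assumes non-zero dimensions. */
int stride_is_valid(size_t Ndim, size_t Mdim, size_t stride)
{
    assert(Ndim > 0 && Mdim > 0);
    size_t total = Ndim * Mdim;
    unsigned char *seen = calloc(total, 1);
    if (seen == NULL)
        return 0;                     /* allocation failure: report invalid */

    int ok = 1;
    for (size_t i = 0; i < Ndim && ok; i++) {
        for (size_t j = 0; j < Mdim; j++) {
            size_t off = i * stride + j;
            if (off >= total || seen[off]) {
                ok = 0;               /* out of bounds, or two elements collide */
                break;
            }
            seen[off] = 1;
        }
    }
    free(seen);
    return ok;
}
```

For a 2 x 3 matrix (Ndim = 2, Mdim = 3), `stride_is_valid(2, 3, 3)` returns 1, while `stride_is_valid(2, 3, 2)` returns 0 because elements (0,2) and (1,0) both land on offset 2. Under the same row-major assumption, the row stride of A (Ndim x Pdim) would be Pdim and that of B (Pdim x Mdim) would be Mdim. For square matrices all of these coincide, which may be why the original index went unnoticed.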