DanielTakeshi closed this issue 9 years ago
Hm ... I am looking at MatKernel.cu, and at this kernel in particular:
__global__ void __copyToInds(float *A, float *B, int *I, long long len) {
  int tid = threadIdx.x + blockDim.x * (blockIdx.x + gridDim.x * blockIdx.y);
  int step = blockDim.x * gridDim.x * gridDim.y;
  long long i;
  for (i = tid; i < len; i += step) {
    B[I[i]] = A[i];
  }
}
Shouldn't the loop body instead be A[I[i]] = B[i]? The B matrix here is the one that contains what I'd want to copy into the original GPU matrix. In my example, B would have been that matrix (well, row vector to be precise) of zeros.
(Also, to be clear, I'm running this on my Mac 10.9 laptop, and am using the latest version of BIDMat master.)
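To convince myself that the indexing part of the kernel is fine, here is a minimal host-side sketch in plain C++ that replays the loop for every simulated thread and counts how often each element is visited. LaunchDims and visitCounts are names I made up for illustration; they are not part of BIDMat or CUDA, and the launch dimensions used are arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for CUDA's built-in launch dimensions.
struct LaunchDims {
    int blockDimX;  // threads per block (blockDim.x)
    int gridDimX;   // blocks along x (gridDim.x)
    int gridDimY;   // blocks along y (gridDim.y)
};

// Replays the kernel's loop "for (i = tid; i < len; i += step)" for every
// simulated thread and counts how many times each index in [0, len) is hit.
std::vector<int> visitCounts(const LaunchDims& d, long long len) {
    assert(len >= 0);
    std::vector<int> counts(static_cast<std::size_t>(len), 0);
    // Same stride as the kernel: total number of threads in the launch.
    const int step = d.blockDimX * d.gridDimX * d.gridDimY;
    for (int by = 0; by < d.gridDimY; ++by) {
        for (int bx = 0; bx < d.gridDimX; ++bx) {
            for (int tx = 0; tx < d.blockDimX; ++tx) {
                // Same flattened thread id as the kernel.
                const int tid = tx + d.blockDimX * (bx + d.gridDimX * by);
                for (long long i = tid; i < len; i += step) {
                    ++counts[static_cast<std::size_t>(i)];
                }
            }
        }
    }
    return counts;
}
```

With, say, 4 threads per block on a 2 x 3 grid and len = 100, every count comes out as exactly 1, so each element is handled by exactly one thread. That points at the assignment inside the loop, not the indexing, as the suspect.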
I think that was the issue. I changed the code to use A[I[i]] = B[i] instead of B[I[i]] = A[i]. Now this works for GIMats and GMats.
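For anyone following along without a GPU, the corrected direction of the copy can be sketched on the host in plain C++. This is only an illustration of the semantics A[I[i]] = B[i], not the BIDMat code; copyToIndsHost is a hypothetical name, and the bounds assertions are my own addition. With the original statement B[I[i]] = A[i], the destination matrix A is never written, and B is written at positions that may lie past its end when B is shorter than the largest index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side sketch of the corrected statement A[I[i]] = B[i].
// A is the destination storage (column-major), B holds the new values,
// and I holds the linear indices into A.
void copyToIndsHost(std::vector<float>& A,
                    const std::vector<float>& B,
                    const std::vector<int>& I) {
    assert(B.size() >= I.size());  // one source value per index
    for (std::size_t i = 0; i < I.size(); ++i) {
        const std::size_t dst = static_cast<std::size_t>(I[i]);
        assert(dst < A.size());  // index must fall inside the destination
        A[dst] = B[i];           // scatter: value i goes to position I[i]
    }
}
```

Calling this with a 36-element A (a 3 x 12 matrix), six zeros in B, and the indices 3, 4, 5, 8, 9, 10 zeroes exactly those six entries of A and leaves B as it was.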
Some test cases with GMats follow. A similar case holds for GIMats. Note that I also added a RuntimeException that checks that the index vector and the value vector have equal lengths, which we really should do to protect the user from doing anything undesirable.
scala> val a = grand(3,12)
a: BIDMat.GMat =
0.40568 0.32611 0.96959 0.79053 0.41202 0.55500 0.22633 0.81040 0.32572 0.96281 0.31198 0.26540
0.59785 0.97801 0.17353 0.40141 0.43692 0.20048 0.96141 0.18022 0.81548 0.86652 0.73854 0.56108
0.11102 0.46033 0.83026 0.91425 0.61166 0.49750 0.12287 0.62431 0.40397 0.011548 0.17791 0.56230
scala> val ii = GIMat(3 on 4 on 5 on 8 on 9 on 10)
ii: BIDMat.GIMat =
3
4
5
..
scala> a(ii)
res2: BIDMat.GMat =
0.32611
0.97801
0.46033
..
scala> a(ii) = gzeros(1,6)
res3: BIDMat.GMat =
0.40568 0 0.96959 0 0.41202 0.55500 0.22633 0.81040 0.32572 0.96281 0.31198 0.26540
0.59785 0 0.17353 0 0.43692 0.20048 0.96141 0.18022 0.81548 0.86652 0.73854 0.56108
0.11102 0 0 0.91425 0.61166 0.49750 0.12287 0.62431 0.40397 0.011548 0.17791 0.56230
scala> a(ii) = gones(1,6)
res4: BIDMat.GMat =
0.40568 1 0.96959 1 0.41202 0.55500 0.22633 0.81040 0.32572 0.96281 0.31198 0.26540
0.59785 1 0.17353 1 0.43692 0.20048 0.96141 0.18022 0.81548 0.86652 0.73854 0.56108
0.11102 1 1 0.91425 0.61166 0.49750 0.12287 0.62431 0.40397 0.011548 0.17791 0.56230
scala> a(ii+10) = 2*gones(1,6)
res5: BIDMat.GMat =
0.40568 1 0.96959 1 0.41202 2 2 0.81040 0.32572 0.96281 0.31198 0.26540
0.59785 1 0.17353 1 2 0.20048 2 0.18022 0.81548 0.86652 0.73854 0.56108
0.11102 1 1 0.91425 2 0.49750 2 0.62431 0.40397 0.011548 0.17791 0.56230
scala> a(ii+10) = 2*gones(1,7)
java.lang.RuntimeException: GMat:updatex error: I and v have unequal lengths 6 and 7, respectively.
at BIDMat.GMat.updatex(GMat.scala:313)
at BIDMat.GMat.update(GMat.scala:236)
... 33 elided
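To make the pattern in these outputs easier to read, here is a small host-side C++ sketch of the two ideas the test cases exercise: decoding a column-major linear index into a (row, column) pair, and rejecting an update whose index and value vectors differ in length. RowCol, toRowCol, and requireEqualLengths are hypothetical names for illustration only, and the exception message is my own wording, not the one produced by GMat.updatex.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

struct RowCol {
    int row;
    int col;
};

// Column-major decoding: linear index k in a matrix with nrows rows
// lives at row k % nrows, column k / nrows (both zero-based).
RowCol toRowCol(int k, int nrows) {
    assert(nrows > 0 && k >= 0);
    return RowCol{k % nrows, k / nrows};
}

// Guard in the spirit of the check added to updatex: refuse an update
// when the number of indices and the number of values disagree.
void requireEqualLengths(std::size_t nIdx, std::size_t nVals) {
    if (nIdx != nVals) {
        throw std::runtime_error("index and value lengths differ: " +
                                 std::to_string(nIdx) + " and " +
                                 std::to_string(nVals));
    }
}
```

For the 3 x 12 matrix above, indices 3, 4, 5 decode to all of column 1, index 8 to row 2 of column 2, and indices 9, 10 to rows 0 and 1 of column 3, which is where the zeros and ones appear. The last test case corresponds to calling the guard with lengths 6 and 7.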
Now the next step is to get this working for GDMats and GLMats, because I currently get an unsatisfied link error when just trying to access elements via indices.
Closing this issue because we resolved it for the GIMat and GMat cases. I will test the other two cases tonight.
John,
In some code I'm writing, I have a matrix and a set of indices into that matrix (column-major order as usual), called innz. I would like to set all of the elements at the positions specified by innz to zero.
to be zero. With CPU matrices, it is straightforward:Alternatively, one can do this:
With GPU matrices, it is a little more complicated because a few update methods are missing. (I am not sure whether this is on purpose; for instance, the Wiki states that the ^* operator is missing, but that is deliberate right now.) I set up another random matrix and used the same set of indices to target for zeros, but assigning a single 0 won't work because the linear update methods are missing. I tried several approaches and checked the source code; the only one that works is to make the right-hand side a GIMat, as shown below:
However, this does not modify the components of ga!
The source code traces back to the def updatex(I:GIMat, v:GIMat):GIMat method in GIMat.scala, which then calls some GPU code: val err = CUMAT.copyToInds(data, v.data, I.data, I.llength);. I read through the BIDMat/jni/src/BIDMat_CUMAT.cpp file, which has that function declaration, but the definition isn't there, so it must be somewhere else (or maybe it's from CUDA itself, so you didn't write it?). EDIT: It's actually in MatKernel.cu, sorry. That seems to be where you wrote the matrix kernels in CUDA. But the definition seems to make sense, based on my rudimentary understanding of CUDA syntax...
Regardless, I wanted to check whether this is the correct behavior for block updates; it doesn't seem that way to me. If this is not the right way to go, any suggestions on how to update the values at specified indices?
Thanks for your time, Daniel