Open JunjieHu opened 9 years ago
You need to read a whole line with "get", but you can write a shorter line within the block using the "updateBatch" function.
The only way to reduce the number of unneeded elements is to store each block as a line.
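To illustrate the block-per-line layout suggested here, a minimal sketch follows. It uses a plain Python dict as a stand-in for the PS table; the helper names `block_row_id`, `pack_blocks`, and `read_block` are hypothetical and not part of any parameter-server API. Each block is flattened into one line, so a single read returns exactly one block and no unneeded elements.

```python
def block_row_id(i, j, m):
    """Line id under which block (i, j) is stored, given m block columns."""
    return i * m + j


def pack_blocks(A, n, m):
    """Split matrix A (list of lists) into n x m blocks, one flattened block per line.

    Assumes the dimensions of A are divisible by n and m.
    """
    bh = len(A) // n      # block height
    bw = len(A[0]) // m   # block width
    table = {}            # stand-in for the PS table: line id -> flattened block
    for i in range(n):
        for j in range(m):
            flat = [A[r][c]
                    for r in range(i * bh, (i + 1) * bh)
                    for c in range(j * bw, (j + 1) * bw)]
            table[block_row_id(i, j, m)] = flat
    return table, bh, bw


def read_block(table, i, j, m, bh, bw):
    """Fetch one line and reshape it back into a bh x bw block."""
    flat = table[block_row_id(i, j, m)]
    return [flat[r * bw:(r + 1) * bw] for r in range(bh)]


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
table, bh, bw = pack_blocks(A, n=2, m=2)
block_01 = read_block(table, 0, 1, 2, bh, bw)  # top-right block of A
```

With this layout, updating block A(i,j) touches only the line `block_row_id(i, j, m)`, at the cost of needing one read per block when the whole matrix is required.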
Let me make it clearer. I mean that all the data in table A must be used in each iteration. For example:
void runnable(int localThreadId) {
    // I need all the values in table A to calculate the updated value of A:
    wtw = v' * A * v
    // ** Here I have to read all the data in A by calling get() line by line, which is time-consuming.
    // This differs from MF, which only needs to get L(i) and R(j) instead of the whole matrix.
    // Perform the update to block A(i,j):
    A(i,j) = A(i,j) + beta * wtw
}
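For reference, the quadratic form in the pseudocode above splits into a sum of per-block terms, v' A v = sum over (i,j) of v_i' A(i,j) v_j, where v_i and v_j are the slices of v matching the rows and columns of block A(i,j). The sketch below checks this identity in plain Python. It does not use any PS API; `quad_form`, `blockwise_quad_form`, and the example sizes are assumptions made for illustration, and A is assumed to be square with dimensions divisible by the block counts.

```python
def quad_form(A, v):
    """Compute v' * A * v for a square matrix A (list of lists) and vector v."""
    size = len(A)
    return sum(v[r] * A[r][c] * v[c] for r in range(size) for c in range(size))


def blockwise_quad_form(A, v, n, m):
    """Compute v' * A * v as a sum of per-block terms v_i' * A(i,j) * v_j."""
    bh = len(A) // n      # block height
    bw = len(A[0]) // m   # block width
    total = 0
    for i in range(n):
        for j in range(m):
            # Contribution of block (i, j): uses only that block and two slices of v.
            partial = sum(v[r] * A[r][c] * v[c]
                          for r in range(i * bh, (i + 1) * bh)
                          for c in range(j * bw, (j + 1) * bw))
            total += partial
    return total


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
v = [1, 0, -1, 2]
full = quad_form(A, v)
blocked = blockwise_quad_form(A, v, n=2, m=2)
```

Every block still has to be read once per iteration to obtain wtw; the sketch only shows that the reads can be organized block by block. Whether the per-block terms can be computed and combined across threads in a given setup is a separate question.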
I think you should avoid updating the whole table A in a single runnable function.
Suppose I have a table A stored in the PS table. I partition table A into n x m blocks, where A(i,j) denotes the (i,j)-th block by rows and columns.
Each thread needs to update just ONE block of table A at a time. Suppose a thread needs to update block A(i,j), but it needs ALL the elements in A to calculate the update for A(i,j). Thus, it has to read every line of A, which I think would slow down the program. Do you have any idea how to avoid reading all the data in A?
This differs from Matrix Factorization: to update the i-th row of L and the j-th column of R in each thread, MF doesn't need all the data in the L and R tables.
-Junjie