Open jpfairbanks opened 10 years ago
The two attached images document what it takes to parallelize HALS. I will also survey the literature to see whether existing work addresses this.
Ramki.
Add an implementation that finds m vectors of size k and n vectors of size k, instead of k vectors of size m and n respectively. This implementation will start a discussion of parallelization and of streaming with local updates: if we update a vertex v, then only the rows of W and the columns of H that correspond to neighbors of v need to be updated.
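To make the locality claim concrete, here is a minimal NumPy sketch, assuming a square adjacency matrix A (n x n) factored as A ≈ W H, with W stored as n row-vectors of length k and H as n column-vectors of length k. The function name `local_update` and the single projected least-squares pass per touched vector are my assumptions, not the repo's code:

```python
import numpy as np

def local_update(A, W, H, v, neighbors, eps=1e-12):
    """Hypothetical sketch: refresh only the factor vectors touched by a
    change at vertex v.  Row u of W depends only on row u of A (plus all
    of H), and column u of H depends only on column u of A (plus all of
    W), so a change at v requires refreshing just v and its neighbors.
    The Gram matrices are held fixed within this local pass for simplicity."""
    k = W.shape[1]
    HHt = H @ H.T            # k x k Gram matrix of H's column-vectors
    WtW = W.T @ W            # k x k Gram matrix of W's row-vectors
    for u in [v] + list(neighbors):
        # Projected least-squares update of row u of W from row u of A.
        W[u, :] = np.maximum(0.0, np.linalg.solve(HHt + eps * np.eye(k), H @ A[u, :]))
        # Projected least-squares update of column u of H from column u of A.
        H[:, u] = np.maximum(0.0, np.linalg.solve(WtW + eps * np.eye(k), W.T @ A[:, u]))
    return W, H
```

Rows and columns outside v's neighborhood are untouched, which is what would make the streaming variant cheap per update.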
In the hals code you have a note saying that prevH is there to prevent parfor errors. The real question is whether we can optimize each row of H independently: do we need all of H when computing Hx in this code?
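For reference, a minimal NumPy sketch of one sequential HALS sweep over the rows of H (my reconstruction of the standard update, not the repo's hals code) makes the dependence explicit: the update of row j reads the current H through `WtW[j, :] @ H`, including rows already refreshed earlier in the sweep. That Gauss-Seidel coupling is exactly what a naive parfor over j would violate:

```python
import numpy as np

def hals_update_H(A, W, H, eps=1e-12):
    """One sequential (Gauss-Seidel) HALS sweep, assuming A ~= W @ H with
    W m x k and H k x n.  Each row update is the exact nonnegative
    minimizer for that row given all the others, so the residual is
    non-increasing."""
    WtA = W.T @ A          # k x n, shared by every row update
    WtW = W.T @ W          # k x k Gram matrix
    for j in range(H.shape[0]):
        # Depends on *all* of H, including rows updated earlier this sweep.
        num = WtA[j, :] - WtW[j, :] @ H + WtW[j, j] * H[j, :]
        H[j, :] = np.maximum(eps, num / max(WtW[j, j], eps))
    return H
```

So the answer under this formulation is yes: the update of each row needs all of H through the Gram-matrix term, which is presumably why prevH (a snapshot) is needed to satisfy parfor.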
Is this an opportunity to exploit slackness in the optimization? We could compute all of the Hx in parallel and then update all of the rows at once. What does the literature say about numerical convergence? Perhaps we run a few extra iterations to compensate for the slower per-sweep convergence. Obviously we can apply the same to W.
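The "update all rows at once" idea can be sketched as a Jacobi-style variant: every row update reads the same snapshot prevH, so the k row updates are independent and trivially parallel. Function name and formulation are my assumptions; note that, unlike the sequential sweep, this simultaneous update is not guaranteed monotone, which is the convergence question to check against the literature:

```python
import numpy as np

def hals_update_H_jacobi(A, W, prevH, eps=1e-12):
    """Jacobi-style HALS sweep: all k rows are computed from the same
    snapshot prevH, with no cross-row dependence, so a parfor (or any
    parallel map) over rows is safe.  Per-sweep progress is typically
    slower than Gauss-Seidel, so a few extra outer iterations may be
    needed to compensate."""
    WtA = W.T @ A                                  # k x n
    WtW = W.T @ W                                  # k x k
    d = np.maximum(np.diag(WtW), eps)[:, None]     # k x 1 row scaling
    # Every row reads only prevH; writes happen all at once on return.
    return np.maximum(eps, prevH + (WtA - WtW @ prevH) / d)
```

For k = 1 this coincides with the sequential update; for k > 1 the two differ, and the gap is where the "slackness" trade-off lives. The same snapshot trick applies symmetrically to W.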