Closed jdmanton closed 10 years ago
Hmm. Is this a case where some kind of pre-allocation might help after all?
Gregory Jefferis
On 5 Sep 2014, at 21:09, James Manton notifications@github.com wrote:
Creating a sparse matrix for 1,000 neurons from the 16,000-neuron full score matrix has been running for more than 90 minutes and still hasn't finished. This is with the full score matrix loaded into memory, so the slowness is not caused by disk access issues.
Is this a case where some kind of pre-allocation might help after all?
Apparently not...
No pre-allocation:
> system.time(foo <- sparse_score_mat(names(kcs20), allbyallmem))
user system elapsed
198.828 2.110 202.860
Pre-allocation:
> system.time(foo <- sparse_score_mat(names(kcs20), allbyallmem))
user system elapsed
220.063 3.425 229.289
I'll try some other implementations of sparse matrices and, if they're not much better, write one myself that's perhaps not as good for linear algebra but is faster for our use cases.
This is now much improved in 452b5c2, by switching from the Matrix package to spam for the sparse matrices.
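For anyone who wants to check this on their own machine, here is a rough sketch of how the two packages could be timed on the same fill. It is not the code from 452b5c2: the helper `fill_selected()`, the toy matrix size and the random scores are all assumptions made up for illustration, and the timings will depend on package versions and on the access pattern.

```r
# Sketch only: fill_selected() and the toy data are hypothetical, not nat.nblast code.
library(Matrix)
library(spam)

set.seed(42)
n <- 300                               # toy size; the matrix in this issue is 16,000 x 16,000
scores <- matrix(runif(n * n), n, n)   # stand-in for the dense score matrix
keep <- sort(sample.int(n, 15))        # stand-in for the selected neurons

# Two-column (row, column) index covering every entry in the selected rows.
idx <- cbind(rep(keep, times = n), rep(seq_len(n), each = length(keep)))

# Copy the selected entries into an initially empty sparse matrix.
fill_selected <- function(m) {
  m[idx] <- scores[idx]
  m
}

# Empty n x n sparse matrices from each package.
empty_Matrix <- Matrix::sparseMatrix(i = integer(0), j = integer(0),
                                     x = numeric(0), dims = c(n, n))
empty_spam <- spam::spam(0, nrow = n, ncol = n)

t_Matrix <- system.time(m1 <- fill_selected(empty_Matrix))
t_spam   <- system.time(m2 <- fill_selected(empty_spam))

# Elapsed seconds for each package.
print(c(Matrix = t_Matrix[["elapsed"]], spam = t_spam[["elapsed"]]))

# Sanity check: both sparse results agree with the dense source on the selected rows.
stopifnot(max(abs(as.matrix(m1)[keep, ] - scores[keep, ])) < 1e-12,
          max(abs(as.matrix(m2)[keep, ] - scores[keep, ])) < 1e-12)
```

Increasing `n` and `length(keep)` towards the real sizes should show whether the gap between the two packages grows in the way the timings in this issue suggest.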
Just happened to notice this:
https://stat.ethz.ch/pipermail/r-help/2010-December/262365.html
but there is not much explanation there.
At least it means that the slowness is due to the library and not because I've done something silly. The spam package seems to be uniformly faster, if somewhat harder to deal with.