Open dkoslicki opened 4 years ago
lsq_linear appears to be unable to handle small_k > 6
since it tries to allocate a large numpy zero array, and the function provided by alexfields was taking longer than 30 minutes to complete the reconstruction for mock community 1 with large_k = 12
, small_k = 4
, q = 0.1
, const = 1000
. In light of this, I am working on option number 3 using tsnnls to see if it can improve speed.
I have been looking for some ways to possibly supplant the Lawson-Hanson algorithm since some tout better times with other nonnegative least squares algorithms. fnnls appears to be a good candidate, and although it may require some adjustment for sparse input, it is at least implemented in Python. I'll see if I can set up a comparison.
From Slack: Chase Colbert 1 day ago Hi @David Koslicki, over the weekend I tried running tests on the first mock community with larger k sizes. However, even natively, there were memory problems with small_k greater than or equal to 8. I realized that this was because I convert A_k_small to a dense matrix when I pass to MinDivLP . However, it seems that scipy.optimize.nnls is only able to accept dense matrices, so I wonder if we may have to write our own implementation of the Lawson-Hanson algorithm in order to accept larger small_k values. Did you have this trouble in the Python implementation of Quikr?
David Koslicki 23 hours ago Hmmm...., I wondered if that may become a problem. When I implemented Quikr, I did it in two languages: Matlab and Julia, so I hadn't run into that problem before. I see a few options here:
np.__config__
leads to better performanceChase Colbert 20 hours ago Taking a look at the thread you linked for #4, I wonder if the implementation the original poster, alexfields, provided would be better since it follows the Lawson-Hanson algorithm as the scipy.optimize.nnls does. Or could it be that lsq_linear may still give similarly sparse solutions?
David Koslicki < 1 minute ago The sparsity of the solutions is definitely something that needs to be investigated/tested. Probably easiest to do this with lsq_linear since it's a simple drop in and test. The issue with the alexfields solution is that it requires A_dot_A, which will/could be massive in our case (but if it works, it might be ok due to the sparse storage of such A's).