Closed wlai0611 closed 1 month ago
@mfoerste4 can you comment how much work this extension would be?
It seems that there is support for this in the cusolver API. It should be a minor effort to pass through the parameter and correct dimensions.
Put the task on your backlog.
Software versions
Legate : 24.05.00.dev+181.g1c5e17e1
Cunumeric : 24.05.00.dev+38.gb87dd7db
Numpy : 2.0.0
Scipy : 1.14.0
Numba : (failed to detect)
CTK package : cuda-version-12.5-hd4f0392_3 (conda-forge)
GPU driver : 535.161.08
GPU devices : GPU 0: Tesla V100-SXM2-32GB
[0 - 7f98c7d18740] 0.000056 {4}{threads}: reservation ('dedicated worker (generic) #1') cannot be satisfied
Jupyter notebook / Jupyter Lab version
No response
Expected behavior
I have a possible alternative that can work around this issue but was just wondering if you had any plans to add full_matrices = False to the cunumeric.linalg.svd algorithm.
In NumPy, if I have a tall, thin matrix, say 1000 rows by 10 columns, and I perform SVD on it with the default settings, it returns a U matrix of size 1000 rows by 1000 columns, which might be too large for memory. But there is an option to specify full_matrices=False, which returns a U matrix of size 1000 rows by 10 columns. Those 10 columns are the left singular vectors corresponding to the matrix's 10 singular values.
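For reference, here is a small NumPy-only sketch of the shape difference described above. It uses a random 1000 by 10 matrix purely as an illustration and does not involve cunumeric.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((1000, 10))  # tall, thin input

# Default (full_matrices=True): U is square in the number of rows.
u_full, s_full, vh_full = np.linalg.svd(a)

# Reduced form: U keeps only min(rows, cols) columns.
u_thin, s_thin, vh_thin = np.linalg.svd(a, full_matrices=False)

print(u_full.shape)  # (1000, 1000)
print(u_thin.shape)  # (1000, 10)

# The reduced factors still reconstruct the input.
reconstruction_ok = np.allclose((u_thin * s_thin) @ vh_thin, a)
```

The reduced U holds 10,000 entries instead of 1,000,000, which is the memory saving this request is about.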
Observed behavior
Currently cunumeric.linalg.svd does not provide the full_matrices = False option.
A possible alternative, which I will use in the meantime, is to project the input matrix (which is tall and thin) onto the Q factor of the QR decomposition of a few random linear combinations of the input matrix's columns. I would then feed the projection (which is short and thin) into the cunumeric SVD and truncate the resulting U matrix before un-projecting (as described in this paper). This can give the U matrix in truncated form without overloading memory.
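A rough sketch of the workaround described above, written against plain NumPy so it is self-contained. The function name `thin_svd_via_projection` and the `oversample` parameter are hypothetical names chosen for illustration. The small SVD is deliberately called with full matrices to mimic the current limitation, and swapping `np` for cunumeric is an assumption I have not verified here.

```python
import numpy as np

def thin_svd_via_projection(a, oversample=5, seed=0):
    """Reduced SVD of a tall matrix using only full-matrices SVD on a small matrix."""
    m, n = a.shape
    k = min(m, n + oversample)            # number of random column combinations
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((n, k))   # random mixing weights
    y = a @ omega                         # m x k: random combinations of a's columns
    q, _ = np.linalg.qr(y)                # m x k orthonormal basis containing range(y)
    b = q.T @ a                           # k x n projection: small
    u_b, s, vh = np.linalg.svd(b)         # full SVD, but u_b is only k x k
    r = min(k, n)
    u = q @ u_b[:, :r]                    # truncate, then un-project to m x r
    return u, s, vh[:r, :]

rng = np.random.default_rng(1)
a = rng.standard_normal((1000, 10))
u, s, vh = thin_svd_via_projection(a)
print(u.shape)  # (1000, 10)
```

Because k is at least the number of columns here, the projection loses nothing and the result should match the reduced SVD up to floating-point error and the sign of the singular vectors. With k smaller than the number of columns, it becomes a low-rank approximation instead.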
Example code or instructions
Stack traceback or browser console output
No response