ORNL / hykkt

HyKKT is an experimental linear solver for Karush-Kuhn-Tucker type of systems.
Other
14 stars 2 forks source link

Allow different sorting methods for permuted column array's rows #7

Open maksud opened 1 year ago

maksud commented 1 year ago

Currently selection sort is used because the number of entries per row is small. It is not clear if the number of entries being sorted justifies a binary sort, but it could be added as an option. @pelesh

maksud commented 1 year ago

I implemented Quick Sort to replace the selection sort. However, Selection Sort seems to be performing well compared to quick sort due to smaller array size. So, selection sort is suggested for use in this case. @pelesh

maksud commented 1 year ago

I have also implemented Insertion Sort and this seems to improve the performance a bit. This can be a replacement of the Selection Sort. @pelesh

cameronrutherford commented 1 year ago
Here is a collapsed snippet of something a coworker sent through.

suppose we have a sparse matrix of order n (n rows and n columns) in compressed row format (ia,ja,a) that we want to permute rows and columns symmetrically, by a permutation specified by p[0:n-1] with the ith permuted column/row is given by p[i] and we want the elements in the permuted matrix to be in sorted column order. so permuted row i was original row p[i]. (similarly for columns) let ia[0:n] be the row pointers that point to the indices in ja and a that correspond to the first element in the row. ja[k] holds the column index for the matrix value a[k] and the row index is the i such that ia[i] <= k < ia[i+1] remember that ia[n+1] = First we need to compute the row pointers for the permuted matrix (this is only done once as the permutation and nonzero structure of the marix is not changing We will call this iap so we start by setting iap[i] to be the number of elements in the i the row of the permuted matrix, which is the number of elements in row p[i] of the original matrix which is ia[p[i]+1] - ia[p[i]] for (i=0;i

It's been a while since I have looked at it, but the idea is that the matrices that we are operating on are symmetric, so taking the transposition of the matrix should be the fastest way to sort.

I am not convinced that snipped is completely descriptive, but I can try and clarify things with him.

djbaxter commented 1 year ago

Cameron, the algorithm I sent, was used years and years ago in the PCGPAK code out of Scientific Computing Associates, does not need the matrix to be symmetric. It sorts the elements within a row by column number in a Compressed row storage format matrix by transposing it twice. The operation count is 3*nzr + n where nzr is the number of non-zeros and n is the number of rows and columns. So if your average number of non-zero elements per row is 4 or more, it will be faster. Its a fundamentally a bin sort. If the matrix is distributed across nodes it can be done in parallel with a little extra effort. I am happy to clarify if needed.

djbaxter commented 1 year ago

I should correct the op count and mention that the matrix need not be square, if it has m row and n columns, then it takes n operations to zero the column pointers array, nzr operations to count the elements in each column - these are reads, n operations to form the partial sums of the column counts for the column pointers of the compressed columns storage format, nzr - operations to move the elements into the transposed compressed column storage format, n operations to reset the column pointers after using them as indices in doing the tanspose move, m operations to copy the row pointers from the csr format to be used as ptrs in the second transpose, then nzr operations to move the transposed elements back to the untransposed form, leaving them in column sorted order with in each row. That's a total of 3nzr + 3n + m operations. The n operations to recover the column pointers after doing the first transpose can be avoided by adding an extra element to the column pointers vector for an op count of 3nzr + 2n + m operations.

cameronrutherford commented 1 year ago

https://dl.acm.org/doi/pdf/10.1145/355791.355796