Open AtheMathmo opened 8 years ago
I was thinking about it too. For now I don't think it is needed. I see two alternatives:
I'd been thinking about the first alternative too - this is indeed the most common use case and may be worth exposing. However, I don't think it is a clean solution long term.
I'm glad to hear that you think it isn't needed for now. I've been working on a blog post about in-place matrix transposition and it made me feel like maybe it would be nice to have sooner rather than later.
I had no idea it was that complicated!
However, I don't think it is a clean solution long term.
Why not? It looks like the NOR function to me. It looks more complicated but it is in fact closer to what is actually happening.
I meant the first solution only - in that case I think it would preferable to just support column-major/stride wrapping across all operations instead of having a single exception for transpose multiplication. (In terms of implementation this would also require AT * B and A * BT for all combinations of Matrix
- MatrixSlice
and MatrixSliceMut
.)
The second alternative seems like it may be enough - but I'm not sure that the complexity wouldn't be much more than just supporting column major directly.
It is possible I am misunderstanding what you're suggesting - I'd be very happy if it truly wasn't all that complicated to support these efficiently.
What I did in CSC/CSR sparse matrices is more or less what @tafia said about creating a new struct and swapping cols/rows.
If a new struct is going to be created and the BaseMatrix
trait will be modified, things could look like this:
Anyway, it is just a idea.
This looks great indeed.
Question is to know whether we can easily implement specialization when needed.
The real concern for me is how we handle the decomposition algorithms for both row-major and column-major. I think the only real solution is writing them twice - we can't really generalize over the access patterns.
I agree that @c410-f3r looks good to me but I'm not sure it's something we can put on the road map yet. I think Sparse matrices need some time to settle (after merging of course) before we start taking emphasis off of dense linear algebra, if that makes sense? Either way, thanks for the comment @c410-f3r! I think if we move forward with this we would use a similar approach to what you suggest.
@AtheMathmo: I think with decompositions, you don't need to write the algorithm twice. I can think of two alternatives. In both cases, the decomposition algorithm is written exactly once, for only one of the storage orders.
mat().to_row_major().qr()
, for example.O(n^3)
and copying is basically the fastest O(n^2)
algorithm there is, so the overhead should still be negligible in most cases. In this case, we should still document what storage order the decomposition is optimized for, and perhaps provide a section in the documentation labelled "Performance tips" or so.From a user experience perspective, option 2 is a clear winner. I think ideally we don't want our users to have to worry about storage order unless they have to. In other words, I think worrying about storage order should be opt-in, letting our users focus on the actual linear algebra until they really find themselves in a situation where they truly need to squeeze out maximum performance (which I bet few of our users actually will in the end).
Also, I agree that this would require some pretty drastic changes, and I think we haven't yet learned enough to put it on the road map, as @AtheMathmo points out. However, I think it's great that we get some more contribution to this issue! Thanks for your insights from your work on sparse matrices, @c410-f3r, they are really valuable.
Gulp...
Is this worth doing?
It will allow us to integrate more nicely with BLAS/LAPACK - and we get things like free transposition as a result. We currently have a lot of the framework to do this but will need to make a lot of changes - especially to the decomposition algorithms and other areas optimized for row-major.
Note also that our
iter_rows
usage will become difficult...This issue should track potential pitfalls and criteria. We'll make a new issue to track the actual implementation if it happens.