dimforge / nalgebra

Linear algebra library for Rust.
https://nalgebra.org
Apache License 2.0

Trait bounds for common matrix operations are more restrictive than their definition requires #1044

Open p-e-w opened 2 years ago

p-e-w commented 2 years ago

I'm writing a computer algebra system that uses nalgebra::base::Matrix as a generic matrix/vector type. Matrix coefficients can be arbitrary symbolic expressions, implying that they are not necessarily elements of any pre-determined algebraic structure. I would still like to perform formal operations like matrix multiplication and exponentiation (with integer exponents), as well as compute standard properties like the determinant and trace.

By the definitions of the matrix product and of the Leibniz determinant, A × B and det(A) are well-defined as long as coefficients can be added and multiplied, yielding a value of the same type. (This can actually be relaxed even further: to perform formal matrix multiplication it is not strictly necessary that the product of two coefficients has the same type as the coefficients themselves, only that the resulting product type is summable.)
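For reference, these are the textbook definitions being invoked (written here in LaTeX notation):

    (A B)_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}
    \det(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} a_{i,\sigma(i)}

Both are built purely from sums and products of the coefficients (the Leibniz formula additionally attaches a sign of ±1 to each term).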

But currently, the nalgebra API is asking for a lot more than that:

  • the Mul implementation requires the coefficient type to satisfy Zero + One + ClosedAdd + ClosedMul;
  • determinant() requires the coefficient type to implement ComplexField.

I'm aware that requiring the coefficients to be elements of certain algebraic structures enables optimizations such as decomposition-based determinant computation, which doesn't work if nothing is known about the coefficients. But formally, determinants, products, etc. are still well-defined, and it would be nice if nalgebra offered methods for computing them.

Formal matrix products and determinants have lots of practical applications, such as solving certain recurrence relations and ODEs, computing characteristic polynomials and resultants, and many, many more.

Andlon commented 2 years ago

Hi @p-e-w, thanks for your input here!

First, let me preface this by saying that my opinion here is not in any way authoritative: I'm a heavy user of nalgebra and a somewhat involved contributor, but that's it.

This is actually very relevant to a discussion we've had several times on Discord. Our current consensus is to move toward more restrictions rather than fewer. Let me elaborate.

One reason is that many of nalgebra's current trait bounds were put in place somewhat haphazardly, based on what the current implementation happens to need. But trait bounds are contracts: changing trait bounds is a breaking change. So trait bounds define not only what the current implementation is able to do, but also what all possible future implementations are allowed to do without breaking API compatibility.

The immediate consequence is that the narrower the trait bounds, the less flexibility we have. Moreover, the narrower and more specialized the trait bounds are, the harder the API is to navigate and understand, because more traits are needed overall to describe it. This is a very real problem: nalgebra's API surface is complex, and many users struggle to wrap their head around it.

Another concern is that, even though the specific method at hand does not in principle require some particular functionality, we may want to call some more general method in the implementation that actually requires more trait bounds. For example, a common strategy for implementing operations like matrix multiplication (and in particular we take this approach in nalgebra-sparse) is to forward the Mul impl to a "BLAS-like" API of the form

C <- alpha * A * B + beta * C

So e.g. A * B is obtained by setting alpha == 1 and beta == 0. But we couldn't even call this method if we didn't have T: Zero + One. This means that even if computing A * B does not technically require us to be able to produce T::zero() or T::one(), requiring them may still be very beneficial for keeping implementation complexity down.
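As a concrete illustration (a sketch only, with made-up names and bounds, not nalgebra's actual internals), here is what forwarding Mul to such a BLAS-like routine looks like, and why it immediately pulls in Zero + One:

    use nalgebra::{DMatrix, Scalar};
    use num_traits::{One, Zero};
    use std::ops::{Add, Mul};

    // Sketch of a BLAS-like routine computing C <- alpha*A*B + beta*C.
    fn gemm_like<T>(c: &mut DMatrix<T>, alpha: T, a: &DMatrix<T>, b: &DMatrix<T>, beta: T)
    where
        T: Scalar + Zero + One + Add<Output = T> + Mul<Output = T>,
    {
        assert_eq!(a.ncols(), b.nrows());
        assert_eq!((c.nrows(), c.ncols()), (a.nrows(), b.ncols()));
        for i in 0..a.nrows() {
            for j in 0..b.ncols() {
                // Plain textbook dot product of row i of A and column j of B.
                let mut dot = T::zero();
                for k in 0..a.ncols() {
                    dot = dot + a[(i, k)].clone() * b[(k, j)].clone();
                }
                c[(i, j)] = alpha.clone() * dot + beta.clone() * c[(i, j)].clone();
            }
        }
    }

    // Forwarding `A * B` to the general routine requires producing T::zero() and
    // T::one(), even though the bare product never needs them in an essential way.
    fn mul_via_gemm<T>(a: &DMatrix<T>, b: &DMatrix<T>) -> DMatrix<T>
    where
        T: Scalar + Zero + One + Add<Output = T> + Mul<Output = T>,
    {
        let mut c = DMatrix::zeros(a.nrows(), b.ncols());
        gemm_like(&mut c, T::one(), a, b, T::zero());
        c
    }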

To help with most of these cases, I've been meaning to write up a proposal that would center most of nalgebra's algebraic bounds around a few traits:

  • Number: Zero + One + Add + Mul + ...: encodes a "typical" integer, real or complex number, including the existence of an additive inverse but without the assumption that there is a multiplicative inverse.
  • RealField: encodes any real number (so we can have multiplicative inverses).
  • ComplexField: encodes any real or complex number.

This way most of the numeric operations in nalgebra could be written with only one of these bounds, which would both make the API easier to understand and give us enough flexibility to make changes in the future without breaking the API. This also covers all conventional types you might want to use, including all signed integer types, floating-point numbers, and possibly also arbitrary-precision arithmetic.
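A rough sketch of what such an umbrella bound could look like, as a trait alias with a blanket impl (the name Number and the exact supertraits here are assumptions, not an existing nalgebra trait):

    use num_traits::{One, Zero};
    use std::ops::{Add, Mul, Neg, Sub};

    // Hypothetical umbrella bound: anything closed under +, -, * with a 0 and a 1.
    pub trait Number:
        Zero + One + Add<Output = Self> + Sub<Output = Self> + Mul<Output = Self> + Neg<Output = Self> + Clone + PartialEq
    {
    }

    // Blanket impl so every qualifying type gets the alias for free.
    impl<T> Number for T where
        T: Zero + One + Add<Output = T> + Sub<Output = T> + Mul<Output = T> + Neg<Output = T> + Clone + PartialEq
    {
    }

    // A numeric routine can then be written with a single, readable bound:
    fn sum<T: Number>(xs: &[T]) -> T {
        xs.iter().cloned().fold(T::zero(), |acc, x| acc + x)
    }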

There is a very significant cost to taking things to the most general setting. In my opinion, trying to stretch nalgebra to support every possible use case is detrimental to what I believe is its main intention: to support users writing linear algebra code in the conventional meaning of the term, i.e. vectors and matrices (plus geometric types) of integers and floating-point numbers.

I'm curious to see what other nalgebra users and contributors might have to say!

sebcrozet commented 2 years ago

I agree with the points made by @Andlon. As far as Mul is concerned, requiring bounds that are less restrictive than Zero + One + ClosedAdd + ClosedMul would make it extremely difficult to implement, as this would prevent a fair chunk of code reuse and could make future optimizations more difficult to apply in a backward-compatible way (as you may guess, matrix multiplication is one of the most performance-sensitive pieces of code in nalgebra).

However, as far as determinant is concerned, I agree that we should add another determinant method, say determinant_cofactor, that implements an inefficient determinant calculation that works with more types. The ability to compute determinants for, e.g., integer matrices has been requested many times, and I don’t foresee any maintenance burden caused by adding this.
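For illustration only, a minimal sketch of what such a determinant_cofactor could look like, using Laplace expansion along the first row (the name, signature and bounds are taken as assumptions from the suggestion above, not a committed API):

    use nalgebra::{DMatrix, Scalar};
    use num_traits::{One, Zero};
    use std::ops::{Add, Mul, Sub};

    // Cofactor-based determinant via Laplace expansion along row 0.
    // Far from optimal, but it only needs +, -, *, zero() and one().
    fn determinant_cofactor<T>(m: &DMatrix<T>) -> T
    where
        T: Scalar + Zero + One + Add<Output = T> + Sub<Output = T> + Mul<Output = T>,
    {
        assert_eq!(m.nrows(), m.ncols(), "determinant requires a square matrix");
        let n = m.nrows();
        if n == 0 {
            return T::one(); // det of the 0x0 matrix is 1 by convention
        }
        if n == 1 {
            return m[(0, 0)].clone();
        }
        let mut det = T::zero();
        for j in 0..n {
            // Minor obtained by deleting row 0 and column j.
            let minor = m.clone().remove_row(0).remove_column(j);
            let term = m[(0, j)].clone() * determinant_cofactor(&minor);
            // Signs alternate along the first row: +, -, +, -, ...
            det = if j % 2 == 0 { det + term } else { det - term };
        }
        det
    }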

p-e-w commented 2 years ago

Thanks for responding, @Andlon and @sebcrozet!

I understand and respect the focus on performance and its implications on what trait bounds make sense. But the problem with omitting implementations for formal operations from nalgebra is that it leaves an annoying gap in the Rust library ecosystem where for many use cases, nalgebra is almost, but not quite, the right crate for the job.

Considering nalgebra's comprehensiveness and incredible genericity, it's strange that it doesn't provide some, well, rather basic linear algebra features, like the actual definition of the matrix product and determinant. Compared to that, I would consider some of the operations nalgebra does provide, such as the Hessenberg decomposition, to be rather obscure (though I recognize that people working in machine learning might disagree).

Since Rust cannot handle multiple overlapping implementations of the same trait, another Mul implementation is probably not feasible anyway. Instead, consider adding *_formal methods (i.e. mul_formal, determinant_formal etc.) that perform the operations according to their formal definitions. This sidesteps any impact on other, performance-critical code, and ensures that such potentially inefficient implementations are not used accidentally.

As far as additional maintenance burden is concerned, I honestly just don't see it. Formal matrix operations are trivial to implement, and the correct and future-proof trait bounds should be obvious from their definitions. For example, this is the matrix multiplication code I wrote for my CAS after not finding the same in nalgebra:

    (Product(_, _), Mat(a), Mat(b)) => {
        if a.is_empty() && b.is_empty() {
            Ok(Matrix(a))
        } else if !a.is_empty() && !b.is_empty() && a.ncols() == b.nrows() {
            Ok(Matrix(crate::expression::Matrix::from_fn(
                a.nrows(),
                b.ncols(),
                |i, j| {
                    (0..a.ncols())
                        .map(|k| a[(i, k)].clone() * b[(k, j)].clone())
                        .reduce(|a, b| a + b)
                        .unwrap()
                },
            )))
        } else {
            Err(IncompatibleOperands {
                expression: self.clone(),
                operand_1: a_original.clone(),
                operand_2: b_original.clone(),
            })
        }
    }

As far as I'm concerned, that's all an implementation of mul_formal would need to do. Optimizations don't matter, since full genericity doesn't allow for them anyway. Realistically, I don't see code like that ever changing. And it wouldn't even necessitate writing new test cases, since the ones for the existing mul implementations can simply be reused.

Of course, most applications won't need these formal operations. But it feels rather strange to have a linear algebra library that supports every type of matrix multiplication imaginable, except for the one that you can find in any introductory textbook on linear algebra.

devonhollowood commented 2 years ago

I just ran into this yesterday — as a user, I was definitely surprised that determinants do not work with integer matrices in nalgebra. That said, I've also often found it somewhat tricky to figure out exactly what is required of the element type for various matrix operations. As such, I would be very excited about this proposed change:

To help with most of these cases, I've been meaning to write up a proposal that would center most of nalgebra's algebraic bounds around a few traits:

  • Number: Zero + One + Add + Mul + ...: encodes a "typical" integer, real or complex number, including the existence of an additive inverse but without the assumption that there is a multiplicative inverse.
  • RealField: encodes any real number (so we can have multiplicative inverses).
  • ComplexField: encodes any real or complex number.

This way most of the numeric operations in nalgebra could be written with only one of these bounds, which would both make the API easier to understand and give us enough flexibility to make changes in the future without breaking the API.

but in the specific case of determinant(), I think that the trait bound should be Number and not ComplexField.

Andlon commented 2 years ago

but in the specific case of determinant(), I think that the trait bound should be Number and not ComplexField.

The problem here is that computing determinants of general integer matrices is incredibly tricky. The usual way you compute a determinant is to compute some kind of matrix decomposition (LU, Cholesky etc.) and compute the determinant from the factors. This simply does not work with integers, because the decompositions rely on real/rational arithmetic (division, in particular).

As @sebcrozet suggested, one way to compute the determinant of an integer matrix is to use cofactor expansion. However, naive cofactor (Laplace) expansion has factorial rather than polynomial complexity (a decomposition-based determinant is O(n^3)). More importantly, intermediate quantities can become so large as to easily cause integer overflow, which would make the method not very useful for anything but perhaps the very smallest of matrices. There are, however, algorithms designed specifically for this purpose, such as the Bareiss algorithm.
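For reference, a minimal sketch of the Bareiss approach (fraction-free Gaussian elimination) over i64; this is not part of nalgebra's API and is shown only to illustrate the idea. It performs only exact integer divisions, so it stays within the integers, and the intermediate entries are determinants of minors, so they grow much more slowly than in cofactor expansion:

    use nalgebra::DMatrix;

    // Bareiss algorithm: fraction-free elimination for integer determinants.
    fn determinant_bareiss(mut m: DMatrix<i64>) -> i64 {
        assert_eq!(m.nrows(), m.ncols(), "determinant requires a square matrix");
        let n = m.nrows();
        if n == 0 {
            return 1;
        }
        let mut sign = 1i64;
        let mut prev = 1i64;
        for k in 0..n - 1 {
            // If the pivot is zero, swap in a lower row with a nonzero entry.
            if m[(k, k)] == 0 {
                match (k + 1..n).find(|&i| m[(i, k)] != 0) {
                    Some(i) => {
                        m.swap_rows(k, i);
                        sign = -sign;
                    }
                    None => return 0, // the whole column is zero => singular
                }
            }
            for i in k + 1..n {
                for j in k + 1..n {
                    // Sylvester's identity guarantees this division is exact.
                    m[(i, j)] = (m[(i, j)] * m[(k, k)] - m[(i, k)] * m[(k, j)]) / prev;
                }
            }
            prev = m[(k, k)];
        }
        sign * m[(n - 1, n - 1)]
    }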

Hence, the general determinant() method must assume Real arithmetic in order to be both efficient and reliable.

Andlon commented 2 years ago

As far as additional maintenance burden is concerned, I honestly just don't see it. Formal matrix operations are trivial to implement, and the correct and future-proof trait bounds should be obvious from their definitions.

Personally, I'm not necessarily opposed to this. However, it requires someone to do the actual work, and I'm pretty confident in saying that none of the maintainers or frequent contributors has the interest or bandwidth to spearhead any kind of effort in this direction.

In any case, if we're not changing the Mul operations, it seems to me that this could even be an external library that extends nalgebra with methods defined on a Formal trait. I don't see any considerable immediate benefit to having this in nalgebra proper. What do you think?
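A rough sketch of what that external extension-trait approach might look like (FormalOps and mul_formal are hypothetical names, and the bounds are assumptions chosen to match the "just add and multiply coefficients" definition discussed above; nothing here is existing nalgebra API):

    use nalgebra::{DMatrix, Scalar};
    use std::ops::{Add, Mul};

    // Extension trait an external crate could define on top of nalgebra's types.
    pub trait FormalOps<T> {
        fn mul_formal(&self, rhs: &DMatrix<T>) -> DMatrix<T>;
    }

    impl<T> FormalOps<T> for DMatrix<T>
    where
        T: Scalar + Add<Output = T> + Mul<Output = T>,
    {
        // Textbook matrix product: only needs coefficient addition and multiplication.
        fn mul_formal(&self, rhs: &DMatrix<T>) -> DMatrix<T> {
            assert_eq!(self.ncols(), rhs.nrows(), "incompatible dimensions");
            assert!(self.ncols() > 0, "an empty sum would need a Zero bound");
            DMatrix::from_fn(self.nrows(), rhs.ncols(), |i, j| {
                (0..self.ncols())
                    .map(|k| self[(i, k)].clone() * rhs[(k, j)].clone())
                    .reduce(|x, y| x + y)
                    .unwrap()
            })
        }
    }

With a crate along those lines, downstream users would get the textbook operations without nalgebra itself having to commit to looser bounds.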