dimforge / nalgebra

Linear algebra library for Rust.
https://nalgebra.org
Apache License 2.0
4.05k stars 485 forks source link

There's a potential bug with the Mat4 inverse code. #1432

Open Makogan opened 3 months ago

Makogan commented 3 months ago

Specifically here.

A determinant close to 0 would still cause numerical issues that can result in NAN's.

I found out the hard way with his snippet:

let intermediate = match cost_matrix.try_inverse() {
        None => {
            let half = S::from(0.5).unwrap();
            (edge.v1().data().clone() + edge.v2().data().clone()) * half
        }
        Some(inverse) => {
            let b = Vector4::<S>::new(
                 S::from(0.).unwrap(),
                 S::from(0.).unwrap(),
                 S::from(0.).unwrap(),
                 S::from(1.).unwrap(),
             );
            // Find the point that minimizes the quadric error.
            let sol = inverse * b;

             // Homogenize the coordinates.
             let mut res = VertData::<R::VertContainer>::default();
             res[0] = sol.x / sol.w;
             res[1] = sol.y / sol.w;
             res[2] = sol.z / sol.w;

            res
        }
    };

And this input matrix:

  ┌                                                                                                             ┐
  │        0.00000000027996544   0.0000000000000015543122 -0.00000000000000027755576  -0.0000000000000008881784 │
  │   0.0000000000000015543122            0.0000023153557          -0.00000043290348           -0.0000019843192 │
  │ -0.00000000000000027755576          -0.00000043290348           0.00000008149934           0.00000037111147 │
  │  -0.0000000000000008881784           -0.0000019843192           0.00000037111147            0.0000017006313 │
  └                                                                                                             ┘

Using a heuristic, the determinant of this matrix is about 8.26×10 −38 Which is essentially 0, however since it is not identically 0, the linked code proceeds to compute multiple divisions which yield NAN's and infs, so the resulting matrix is useless.

On it's current form the method is problematic, as a user would expect that if try_inverse succeeds, then the result is sane.

Possible fixes:

1) Add an epsilon parameter to the signature to define the failure case for the determinant, rather than test against 0. 2) Add a post computation check that returns None if any of the entries are NaNs/infs 3) Add a debug_assert that crashes the application if any entires are Nans but does not add overhead for release builds.

vextorspace commented 2 months ago

So this is basically the same as the method taught in school for manual solving of matrices. Should we be using something like QR decomposition with Givens rotations instead which is much more stable under numerical issues that show up from poorly conditioned matrices? Such as Matlab does when you use the solve command? https://en.wikipedia.org/wiki/QR_decomposition

https://en.wikipedia.org/wiki/Gaussian_elimination#Gauss%E2%80%93Jordan_elimination I think is close to what is currently being used. Another benefit would be that Givens rotations are row-independent and therefor can be parallelized easily.

vextorspace commented 2 months ago

although LU factorization is faster, so maybe should be used instead of QR. https://search.brave.com/search?q=how+does+lapack+invert+matrix&source=desktop

or maybe it is up to the user to use a numerically strong method if they need one.