NVIDIA / warp

A Python framework for high performance GPU simulation and graphics
https://nvidia.github.io/warp/
Other
4.28k stars 243 forks source link

[BUG] Singular value decomposition(warp.svd3()) gives wrong results #281

Closed swan2818 closed 3 months ago

swan2818 commented 4 months ago

Bug Description

warp showed the largest error in the following example:

import numpy as np
import warp as wp
import taichi as ti
import torch

A_mat = np.random.randn(100, 3, 3)/1e4
B_mat = np.identity(3, dtype=np.float64)
A_mat = A_mat + B_mat

U_np, S_np, Vt_np = np.linalg.svd(A_mat, full_matrices=False)
B_mat = U_np @ (S_np[..., None]*Vt_np)

A_wp = wp.array(A_mat, dtype=wp.mat33d, shape=100)
B_wp = wp.array(A_mat, dtype=wp.mat33d, shape=100)

@wp.kernel
def SVD_wp(A:wp.array(dtype=wp.mat33d),
           B:wp.array(dtype=wp.mat33d)):
    p = wp.tid()

    U = wp.mat33d()
    S = wp.vec3d()
    V = wp.mat33d()

    wp.svd3(A[p], U, S, V)
    B[p] = U * wp.diag(S) * wp.transpose(V)

wp.launch(kernel=SVD_wp, dim=100, inputs=[A_wp, B_wp])
A_wp = A_wp.numpy()
B_wp = B_wp.numpy()

ti.init(arch=ti.gpu, default_fp=ti.f64)
A_ti = ti.Matrix.field(n=3, m=3, dtype=ti.f64, shape=100)
B_ti = ti.Matrix.field(n=3, m=3, dtype=ti.f64, shape=100)
A_ti.from_numpy(A_mat)
B_ti.from_numpy(B_mat)

@ti.kernel
def SVD_ti():
    for p in A_ti:
        U, S, V = ti.svd(A_ti[p])
        B_ti[p] = U@S@V.transpose()

SVD_ti()
B_ti = B_ti.to_numpy()
A_ti = A_ti.to_numpy()

A_to = torch.tensor(A_mat)
if torch.cuda.is_available():
    A_to = A_to.to("cuda")
U, S, Vh = torch.linalg.svd(A_to, full_matrices=False)
B_to = U @ (S[..., None]*Vh)
B_to = B_to.cpu()
B_to = B_to.numpy()

e_np = np.linalg.norm(A_mat-B_mat, axis=(1,2))/np.linalg.norm(A_mat, axis=(1,2))
e_wp = np.linalg.norm(A_mat-B_wp, axis=(1,2))/np.linalg.norm(A_mat, axis=(1,2))
e_ti = np.linalg.norm(A_mat-B_ti, axis=(1,2))/np.linalg.norm(A_mat, axis=(1,2))
e_to = np.linalg.norm(A_mat-B_to, axis=(1,2))/np.linalg.norm(A_mat, axis=(1,2))

import matplotlib.pyplot as plt
import scienceplots
plt.style.use(['science','ieee'])

plt.boxplot([e_np, e_wp, e_ti, e_to], labels=['Numpy','Warp', 'Taichi', 'Pytorch'], whis = 1.5)
plt.yscale('log')
plt.ylabel('Error')

plt.ylim(1e-16, 1e-6)
plt.show()

The result is given as:

fp64

The error in matrix decomposition using Warp is 100 times ~ 100,000 times larger than the one in Numpy, Taichi and Pytorch.

Singular value decomposition in Warp refers to the numerical method described in "Computing the Singular Value Decomposition of 3x3 matrices with minimal branching and elementary floating point operations", and especially the algorithm of Fast 3x3 SVD by Ericjang. (https://github.com/ericjang/svd3)

Recently, There was an issue reported about the incorrect Givens rotation in the special case. A manual that can make this algorithm more robust is also suggested. (https://github.com/ericjang/svd3/issues/10)

Using numerically unstable matrix decomposition could ruin the results of physical simulation. I think this problem is fundamental in all kinds of physical simulations since most of them use matrix decomposition and accumulated error could ruin the results.

System Information

No response

gdaviet commented 4 months ago

Hi @swan2818, thanks for reporting this issue. The loss of accuracy seems to come mainly for the use of not-precise-enough numerical constants with fp64 data types (In my tests accuracy of warp svd with fp32 is in the same ballpark as numpy and torch). We'll push a fix shortly

gdaviet commented 3 months ago

Fix has been merged