microsoft / cliffordlayers

https://microsoft.github.io/cliffordlayers
MIT License
145 stars, 18 forks

modify the generation of eye matrix of the function whiten_data #12

Closed liluo2 closed 10 months ago

liluo2 commented 10 months ago

For the line here: https://github.com/microsoft/cliffordlayers/blob/9248979d747d13ec550282e2325a626484cfd753/cliffordlayers/nn/functional/batchnorm.py#L67 In my case, the scale of the feature matrix is up to 1e10 (in my opinion this is a problem in itself, but that can be discussed later). When X is multiplied with X^T in a batched way, cov = torch.matmul(X, X.transpose(-1, -2)) / X.shape[-1], the resulting covariance matrix may not be numerically positive-definite, so the Cholesky factorization fails. But if I extract the problematic matrix from the batch and compute the product on its own, the result is positive-definite (line 70, 'U = torch.linalg.cholesky(cov + eye).mH', runs). So, to prevent this numerical instability, I modified the eye matrix as follows:

Whether choosing the max value as the diagonal value is reasonable might need to be discussed. Here's the small modification. Thank you for your brilliant work!
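To make the idea concrete, here is a minimal sketch of a scale-aware diagonal term. It is not the code from this issue, and it rests on several assumptions: "max value" is taken to mean the largest absolute entry of each covariance matrix, the helper name scaled_jitter and the default eps = 1e-5 are hypothetical, and float64 is used only to keep the demo well-behaved.

```python
import torch


def scaled_jitter(cov: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Hypothetical helper: per-matrix diagonal term eps * max|cov| * I."""
    n = cov.shape[-1]
    # Largest absolute entry of each matrix in the batch, kept as shape (..., 1, 1).
    scale = cov.abs().amax(dim=(-2, -1), keepdim=True)
    # Never scale below 1, so small-magnitude inputs still get at least eps * I.
    scale = scale.clamp_min(1.0)
    eye = torch.eye(n, dtype=cov.dtype, device=cov.device)
    return eps * scale * eye


torch.manual_seed(0)
# Batch of 4 feature matrices with 3 rows and only 2 columns, at a scale of 1e10.
# Each 3x3 covariance has rank at most 2, so it is singular before regularization.
X = 1e10 * torch.randn(4, 3, 2, dtype=torch.float64)
cov = torch.matmul(X, X.transpose(-1, -2)) / X.shape[-1]

reg = cov + scaled_jitter(cov)
# cholesky_ex does not raise on failure; info is 0 for each matrix that factorized.
L, info = torch.linalg.cholesky_ex(reg)
U = L.mH  # upper-triangular factor, as in the quoted line 70
```

Because the diagonal term grows with the magnitude of each matrix, it stays well above the rounding error of entries near 1e20, which a fixed eps * I cannot do. The info tensor from torch.linalg.cholesky_ex is also a convenient way to find which batch entries fail without pulling them out of the batch one by one.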