For the line here: https://github.com/microsoft/cliffordlayers/blob/9248979d747d13ec550282e2325a626484cfd753/cliffordlayers/nn/functional/batchnorm.py#L67
In my case, the scale of the feature matrix is up to 1e10 (in my opinion this is itself a problem, but I'll leave that for a separate discussion). When multiplying X with X^T in a batched way, `cov = torch.matmul(X, X.transpose(-1, -2)) / X.shape[-1]`, the resulting covariance matrix can fail to be positive-definite due to floating-point error. But if I extract the problematic matrix from the batch and do the multiplication on its own, the result is positive-definite (line 70, `U = torch.linalg.cholesky(cov + eye).mH`, runs without error). So, to prevent this numerical instability, I modify the eye matrix as follows:
1. Take the maximum value of each batch and build, per batch, a diagonal matrix with that maximum as its diagonal values, named `max_values`.
2. Multiply `max_values` by 1e-5 (the `eps` parameter) and add the result to the `cov` matrix.
3. Perform the Cholesky decomposition.
Whether choosing the max value as the diagonal value is the right approach might need to be discussed. Here's the small modification. Thank you for your brilliant work!
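Below is a minimal sketch of what the change could look like, written as a standalone helper (`cholesky_with_scaled_jitter` is a hypothetical name) and reading "the max value of each batch" as the per-batch maximum of the covariance entries; the actual patch would sit inline in `clifford_batch_norm` in `batchnorm.py`:

```python
import torch

def cholesky_with_scaled_jitter(X: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Batched covariance, as on line 67 of batchnorm.py.
    cov = torch.matmul(X, X.transpose(-1, -2)) / X.shape[-1]

    # Per-batch maximum of the covariance entries, reshaped so it
    # broadcasts against the trailing (n, n) matrix dimensions.
    max_values = cov.flatten(start_dim=-2).max(dim=-1).values[..., None, None]

    # Jitter scaled to the data's magnitude: eps * max on the diagonal,
    # instead of the fixed identity `eye` used upstream.
    n = cov.shape[-1]
    eye = torch.eye(n, dtype=cov.dtype, device=cov.device)

    # Cholesky now tolerates batches whose covariance became
    # indefinite through float error at feature scales around 1e10.
    return torch.linalg.cholesky(cov + eps * max_values * eye).mH
```

The point of scaling by the per-batch max is that the regularization becomes relative to the data's scale: a fixed `eps * I` that stabilizes features of order 1 is negligible next to covariance entries of order 1e10.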