kevinli1324 opened this issue 1 year ago
Hello, just wanted to mention that I am experiencing this exact issue in my testing as well, for large datasets past roughly 1500 samples.
I have been able to fix this problem by manually evaluating the kernel in the forward pass of my GP class and adding noise to the diagonal to keep it PSD:
K = covar_x.to_dense()  # densify the lazy covariance
return MultivariateNormal(mean_x, K + noise * torch.eye(K.shape[0]).to(self.device))
However, this likely scales poorly and loses the benefits of the LazyTensor abstraction.
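For reference, a minimal sketch of how this workaround looks inside a standard GPyTorch ExactGP. The class name, kernel choice, and jitter value below are just placeholders, not my actual model:

import torch
import gpytorch
from gpytorch.distributions import MultivariateNormal

class DenseJitterGP(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, jitter=1e-4):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
        self.jitter = jitter

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        # Densify the lazy covariance (on older GPyTorch versions this may be
        # covar_x.evaluate() instead of to_dense()) and add diagonal jitter for PSD.
        K = covar_x.to_dense()
        K = K + self.jitter * torch.eye(K.shape[-1], device=K.device, dtype=K.dtype)
        return MultivariateNormal(mean_x, K)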
Hmm, @activatedgeek, is this perhaps because of a change in GPyTorch? Do we need to put version restrictions on it if we have not done so already?
I had a look into it. I think it may come from some implicit assumptions in the GPyTorch internals that LazyTensors can be extended to batch objects of batch size 1 via _unsqueeze_batch, but I'm checking with those folks now. I believe it only triggers for large sample sizes because CG + SLQ is only used instead of Cholesky at sufficiently large problem sizes, and _unsqueeze_batch is called when constructing the preconditioner to use with CG.
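If that's indeed the cause, one quick way to sanity-check it is to push GPyTorch back onto the Cholesky path so the CG preconditioner (and hence _unsqueeze_batch) is never constructed. This is only a diagnostic, since Cholesky costs O(n^3) time and O(n^2) memory; the threshold and the `model`/`mll`/`train_x` names below are placeholders from your own setup:

import gpytorch

# Diagnostic only: raise the size below which GPyTorch falls back to Cholesky,
# so the CG + SLQ path (and its preconditioner) is skipped for this call.
with gpytorch.settings.max_cholesky_size(20000):
    output = model(train_x)
    loss = -mll(output, train_y)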
It appears that making the following changes to SquareLazyLattice in bilateral_kernel.py at least gets the code to run (it reshapes in case x has a batch dimension). (Note that with the dimensional assert statements commented back in, the error is caught in the _unsqueeze_batch functionality.)
class SquareLazyLattice(LazyTensor):
    def __init__(self, x, dkernel=None):
        super().__init__(x, dkernel=dkernel)
        # assert x.ndim==2, f"No batch (even of size 1) inputs supported, got {x.ndim} with shape {x.shape}"
        self.x = x.reshape(*x.shape[-2:])  # drop any leading (size-1) batch axes
        self.orig_shape = x.shape
        self.dkernel = dkernel

    def _matmul(self, V):
        # assert V.ndim<=2
        out = LatticeFilterGeneral.apply(V.reshape(*V.shape[-2:]), self.x, self.dkernel)
        return out.reshape(V.shape)  # unflatten if there were batch axes

    def _size(self):
        return torch.Size((*self.orig_shape[:-1], self.x.shape[-2]))

    def _transpose_nonbatch(self):
        return self  # symmetric, so transpose is a no-op

    def diag(self):
        return torch.ones_like(self.x[..., 0])
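As a rough smoke test of the patch (shapes are made up, and it assumes LatticeFilterGeneral accepts dkernel=None; otherwise pass the same dkernel used in bilateral_kernel.py), a singleton batch axis like the one _unsqueeze_batch introduces should now go through:

import torch

# A size-1 batch axis should round-trip through _matmul, with the output
# keeping the input's shape and _size reporting the batched kernel shape.
x = torch.randn(1, 2000, 3)   # (batch=1, n, d) inputs
v = torch.randn(1, 2000, 5)   # (batch=1, n, k) right-hand sides
lattice = SquareLazyLattice(x)
out = lattice._matmul(v)
assert out.shape == v.shape
assert lattice._size() == torch.Size((1, 2000, 2000))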
Once we figure out what's really going on we can push out an update. Let me know if this band-aid solves the problem for you guys.
Hi! I was trying to use the library and ran into the error that reads "ValueError: left interp size (torch.Size([20000, 1, 1])) is incompatible with base lazy tensor size (torch.Size([20000, 20000])). Make sure the two have the same number of batch dimensions".
This only happens when I run on data with a high sample size / dimension. I've modified the code in notebooks/bi_gp_ls.ipynb to replicate the error, though the error occurs with different kernel settings as well. Is there an easy way to fix this, or are there extra steps when dealing with larger datasets? Thanks!