wjmaddox opened this issue 5 years ago
You should use a ProductStructureKernel rather than a ProductKernel. I think it should be more memory efficient.
If possible, I'd actually like to have separate parameters for the dimensions of the kernel. This seems to rule out using ProductStructureKernel, which in my understanding shares parameters across dimensions.
You can have different parameters for different dimensions by setting ard_num_dims on the base kernel.
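For concreteness, a minimal sketch of that suggestion, assuming 2-dimensional inputs (the surrounding model setup is omitted; only the kernel classes are real GPyTorch API):

```python
import gpytorch

# ard_num_dims=2 gives the RBF base kernel a separate lengthscale per input
# dimension instead of a single shared lengthscale.
base_kernel = gpytorch.kernels.RBFKernel(ard_num_dims=2)
print(base_kernel.lengthscale.shape)  # torch.Size([1, 2]) -- one lengthscale per dimension

# Wrapped as suggested above.
covar_module = gpytorch.kernels.ProductStructureKernel(base_kernel, num_dims=2)
```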
Thanks, I'll look into trying that for my use case.
I seem to also be getting memory errors on the CPU at a grid size of 250 when switching out the covariance module for the following:
```python
self.covar_module = gpytorch.kernels.ProductStructureKernel(
    gpytorch.kernels.GridKernel(
        gpytorch.kernels.RBFKernel(ard_num_dims=2),
        grid=train_x,
    ),
    num_dims=2,
)
```
Stack trace:

```
~/Documents/Code/gpytorch/gpytorch/kernels/kernel.py in _sq_dist(self, x1, x2, postprocess, x1_eq_x2)
     38         x1_ = torch.cat([-2. * x1, x1_norm, x1_pad], dim=-1)
     39         x2_ = torch.cat([x2, x2_pad, x2_norm], dim=-1)
---> 40         res = x1_.matmul(x2_.transpose(-2, -1))
     41
     42         if x1_eq_x2:

RuntimeError: [enforce fail at CPUAllocator.cpp:56] posix_memalign(&data, gAlignment, nbytes) == 0. 12 vs 0
```
🐛 Bug
I'm dealing with an issue where memory usage on the CPU is relatively small, but GPU memory usage is considerably larger. It seems to be made worse by the relatively large number of trace samples I need in my application.
To reproduce
Code snippet to reproduce
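(The original snippet was not captured in this thread. The following is only an illustrative guess at the kind of setup involved; the data sizes, the model class, and the per-dimension ProductKernel factors are assumptions, not taken from the report.)

```python
import torch
import gpytorch

# Hypothetical reconstruction, not the reporter's actual snippet.
train_x = torch.rand(1000, 2)
train_y = torch.sin(train_x.sum(-1))

class GPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        # Product of per-dimension SKI kernels; multiplying the resulting lazy
        # tensors is what exercises the MulLazyTensor code path discussed below.
        self.covar_module = gpytorch.kernels.ProductKernel(
            gpytorch.kernels.GridInterpolationKernel(
                gpytorch.kernels.RBFKernel(), grid_size=250, num_dims=1, active_dims=[0]
            ),
            gpytorch.kernels.GridInterpolationKernel(
                gpytorch.kernels.RBFKernel(), grid_size=250, num_dims=1, active_dims=[1]
            ),
        )

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = GPModel(train_x, train_y, likelihood)
```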
Stack trace/error message
Expected Behavior
https://github.com/cornellius-gp/gpytorch/blob/4c71e54f519fe1c51db8d0609c5a88532039f17b/gpytorch/lazy/mul_lazy_tensor.py#L49
directly evaluates the root tensor. The following line then creates a much larger matrix than desired (90k x 10100 in my example), but it's possible this is necessary for the evaluation.
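For intuition on why the evaluated root ends up so wide, here is a small standalone check (not GPyTorch's code) of the identity behind roots of elementwise products: a root of the elementwise product of two PSD matrices can be taken as the row-wise Kronecker product of their roots, so the widths multiply.

```python
import torch

# Roots of width r1 and r2 combine into a root of width r1 * r2.
n, r1, r2 = 6, 3, 4
R_a = torch.randn(n, r1)
R_b = torch.randn(n, r2)
A = R_a @ R_a.t()
B = R_b @ R_b.t()

# Row-wise Kronecker product: row i is kron(R_a[i], R_b[i]).
root = (R_a.unsqueeze(-1) * R_b.unsqueeze(-2)).reshape(n, r1 * r2)

print(root.shape)                              # torch.Size([6, 12])
print(torch.allclose(root @ root.t(), A * B))  # True
```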
System information
Please complete the following information:
Additional context
A possible resolution is to instead call GridInterpolationKernel(ProductKernel(RBFKernel())).
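As a sketch, one reading of that suggestion is below: interpolate a single product kernel onto one 2-d grid (SKI) instead of taking a product of separate grid kernels. The per-dimension RBF factors via active_dims and grid_size=250 are my assumptions, not something specified above.

```python
import gpytorch

# Interpolate a product of per-dimension RBF kernels onto a single 2-d grid.
covar_module = gpytorch.kernels.GridInterpolationKernel(
    gpytorch.kernels.ProductKernel(
        gpytorch.kernels.RBFKernel(active_dims=[0]),
        gpytorch.kernels.RBFKernel(active_dims=[1]),
    ),
    grid_size=250,
    num_dims=2,
)
```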