cornellius-gp / gpytorch

A highly efficient implementation of Gaussian Processes in PyTorch
MIT License
3.57k stars, 560 forks

[Question] Using `GPyTorch` with a distance matrix or covariance matrix as input #1700

Open FanwangM opened 3 years ago

FanwangM commented 3 years ago

Hi! Thanks for providing this nice tool for us!!

My question: is it possible to use a symmetric pairwise distance matrix, or a covariance matrix derived from that distance matrix, as the input instead of the conventionally used X_train? In the examples given in the documentation, we use X_train and y_train, and the kernel function we select transforms X_train into a covariance matrix. But in my case, I can only compute the distance matrix.

Any suggestions would be appreciated. Thanks!

wjmaddox commented 3 years ago

What exactly do you mean? Do you mean that you only have access to Cov(X_train, X_train) or that the input (e.g. X_train) itself is a covariance matrix?

FanwangM commented 3 years ago

I mean that I only have Cov(X_train, X_train); I don't have X_train itself.

@wjmaddox

wjmaddox commented 3 years ago

How do you intend to make predictions then? Do you also have access to Cov(X_train, X_test)?

FanwangM commented 3 years ago

Yes, I have Cov(X_train, X_test). Actually, I have pairwise distances, which let me build the covariance matrix with a kernel function such as the Gaussian kernel. Cross-validation can be done by slicing the covariance matrix. I also have target values, y_train (a vector of real numbers).
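For readers following along, here is a minimal NumPy sketch of the step described above: turning a precomputed distance matrix into a Gaussian covariance matrix and slicing it into train/test blocks. The function names, the `lengthscale` and `outputscale` parameters, and the eigenvalue check are illustrative assumptions, not part of GPyTorch or of the poster's code. One caveat: a Gaussian kernel applied to an arbitrary (non-Euclidean) dissimilarity is not guaranteed to be positive semi-definite, so checking the smallest eigenvalue is a reasonable precaution.

```python
import numpy as np


def gaussian_kernel_from_distances(D, lengthscale=1.0, outputscale=1.0):
    """Map a pairwise distance matrix D to a Gaussian (RBF) covariance matrix."""
    D = np.asarray(D, dtype=float)
    return outputscale * np.exp(-0.5 * (D / lengthscale) ** 2)


def split_covariance(K, train_idx, test_idx):
    """Slice a full covariance matrix into the blocks needed for prediction."""
    K_train = K[np.ix_(train_idx, train_idx)]  # Cov(X_train, X_train)
    K_cross = K[np.ix_(train_idx, test_idx)]   # Cov(X_train, X_test)
    K_test = K[np.ix_(test_idx, test_idx)]     # Cov(X_test, X_test)
    return K_train, K_cross, K_test


def min_eigenvalue(K):
    """Smallest eigenvalue of a symmetric matrix; clearly negative means not PSD."""
    return float(np.linalg.eigvalsh(K).min())
```

For cross-validation, each fold would call `split_covariance` with a different choice of `train_idx` and `test_idx` on the same full matrix.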

Balandat commented 3 years ago

I'm curious about the reason why you only have the covariances. Is this a really complex kernel that you can't easily implement in gpytorch? Or do you have unlabeled data points only for some other reason (proprietary data, privacy...)?

FanwangM commented 3 years ago

I have an in-house algorithm that computes the distance/dissimilarity between two objects, and I have computed the pairwise distances for a set of objects. This lets me build a covariance matrix from the distances, for example with a Matern kernel. I don't have coordinates for the objects, which makes it hard to use the built-in kernels in gpytorch. I don't need a complicated kernel. I am looking for something like this in GPyTorch: https://gist.github.com/amueller/1351047.

Hope this is a little bit helpful. @Balandat

Balandat commented 3 years ago

Cool stuff. It should be possible to do this by using the components of the prediction strategies - take a look here: https://github.com/cornellius-gp/gpytorch/blob/master/gpytorch/models/exact_prediction_strategies.py#L249-L264
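The linked lines are not reproduced here. As a rough illustration of the underlying idea, the standard exact GP predictive equations can be evaluated directly from precomputed covariance blocks, with no coordinates involved. The sketch below uses plain NumPy, not GPyTorch's prediction-strategy classes, and assumes a zero prior mean and a fixed noise variance; the function name and the `noise` parameter are illustrative.

```python
import numpy as np


def gp_predict_from_blocks(K_train, K_cross, K_test, y_train, noise=1e-2):
    """Exact GP posterior from precomputed blocks (zero prior mean assumed).

    K_train: (n, n) Cov(X_train, X_train)
    K_cross: (n, m) Cov(X_train, X_test)
    K_test:  (m, m) Cov(X_test, X_test)
    """
    n = K_train.shape[0]
    # Cholesky factor of the noisy training covariance.
    L = np.linalg.cholesky(K_train + noise * np.eye(n))
    # alpha = (K_train + noise * I)^{-1} y_train, via two solves with L.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_cross.T @ alpha
    # Posterior covariance: K_test - K_cross^T (K_train + noise * I)^{-1} K_cross.
    V = np.linalg.solve(L, K_cross)
    cov = K_test - V.T @ V
    return mean, cov
```

With a single training point, `K_train = [[1.0]]`, `K_cross = [[0.5]]`, `K_test = [[1.0]]`, `y_train = [2.0]` and `noise = 0.0`, this gives a predictive mean of 1.0 and a predictive variance of 0.75. Hyperparameters such as the lengthscale would still have to be chosen separately, for example by cross-validation over the distance-to-covariance mapping.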

FanwangM commented 3 years ago

Thank you so much!! Very nice implementation and looks promising! I will take a closer look at it tomorrow and let you know the updates.

mashiro210 commented 9 months ago

Dear @FanwangM

Actually, I would like to do the same thing as in your question. Did you succeed in implementing a GPR model using a pre-calculated distance matrix? If you don't mind, I would like to see your Python script, please.

FanwangM commented 9 months ago

I tried to dig into this, but didn't have any luck. Sorry about that. @mashiro210