mksadoughi / Multi-output-Gaussian-Process

MIT License

Multi-output-Gaussian-Process

Multi-output regression

In multi-output regression (also called multi-target, multivariate, or multi-response regression), we aim to predict multiple real-valued output variables. One simple approach is to combine several single-output regression models, but this approach has several drawbacks and limitations [1].

To address these drawbacks and limitations, we look for multi-output regression methods that model the dataset by considering not only the relationships between the input factors and the corresponding targets, but also the relationships among the targets themselves. Several regression methods have been developed for multi-output problems. Click here for a great review of these methods. For example, multi-target SVM and random forests are two of the most popular ones.
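To make the baseline concrete, the naive "one model per target" approach can be sketched in Python with plain numpy. The toy data and the per-target linear model are illustrative assumptions, not part of this project:

```python
import numpy as np

# Toy dataset: 4 samples, 2 input features, 2 targets (values are illustrative).
X = np.array([[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 4.0]])
Y = np.array([[3.0, 4.0], [2.0, 3.0], [1.0, 2.0], [4.0, 5.0]])

# Naive approach: fit an independent least-squares linear model per target
# column, which ignores any correlation between the targets.
Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # add a bias column
W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)      # one weight column per target

def predict(X_new):
    Xb_new = np.hstack([X_new, np.ones((X_new.shape[0], 1))])
    return Xb_new @ W

Y_hat = predict(X)   # each target is predicted in isolation
```

Because each column of W is fit separately, any information shared between the targets is simply discarded, which is exactly the limitation the methods below address.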

In this research, I propose and implement a new technique for multi-output regression using a Gaussian process (GP) model.

Univariate GP

Let’s first introduce the univariate GP. A univariate GP defines a Gaussian distribution over functions, which can be used for nonlinear regression, classification, ranking, preference learning, or ordinal regression. The univariate GP has several advantages over other regression techniques:

  1. It outperforms other methods on problems where the dataset is small because each sample is computationally expensive to generate.
  2. It provides not only the mean prediction but also the prediction uncertainty. The prediction uncertainty can be used to define a confidence level for our predictions.

PyKrige is a Python toolkit that implements univariate GP (kriging) models. In MATLAB, you can use fitrgp to fit a univariate GP model.
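As a rough illustration of what such toolkits compute, here is a minimal numpy sketch of univariate GP regression with a squared-exponential kernel. The hyperparameters are hand-picked for this toy problem; real toolkits learn them by maximizing the likelihood:

```python
import numpy as np

def rbf(X1, X2, length=1.0, var=1.0):
    """Squared-exponential (RBF) kernel between two sets of row vectors."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / length**2)

def gp_predict(X, y, X_new, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at new inputs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X_new, X)
    alpha = np.linalg.solve(K, y)
    mean = Ks @ alpha
    cov = rbf(X_new, X_new) - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

# Toy 1-D regression: recover sin(x) from 8 noiseless samples.
X = np.linspace(0.0, 5.0, 8)[:, None]
y = np.sin(X).ravel()
mean, var = gp_predict(X, y, np.array([[2.5]]))
# mean[0] is the prediction at x = 2.5; var[0] quantifies its uncertainty.
```

The returned variance is exactly the uncertainty estimate mentioned in point 2 above, which most other regression techniques do not provide out of the box.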

Multivariate GP

In this research, I extended the univariate GP to a multivariate GP to handle problems with multiple interdependent targets. The key to this extension is redefining the covariance matrix so that it captures the correlation among the targets. To this end, I used the nonseparable covariance structure described by Thomas E. Fricker. A conventional univariate GP model builds the spatial covariance matrix over the input features; the kernel’s hyperparameters are then learned by maximizing the likelihood of the observations. In a multivariate GP, however, we extend the covariance matrix to cover the correlation between both features and targets. This introduces a larger number of hyperparameters to optimize, so the optimization process takes somewhat longer. For the mathematical description of the multivariate GP, please see our recent publication [click here].
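For intuition, the "extended covariance" idea can be sketched with a simplified separable (Kronecker) structure. The coregionalization matrix Bc below is an illustrative assumption, and note that the nonseparable structure of Fricker used in this work is strictly richer than this sketch:

```python
import numpy as np

def rbf(X1, X2, length=1.0):
    """Squared-exponential kernel over the input features."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

# Assumed 2x2 coregionalization matrix: off-diagonal entries encode the
# correlation between the two targets (illustrative values).
Bc = np.array([[1.0, 0.6], [0.6, 1.0]])

X = np.array([[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 4.0]])
Kx = rbf(X, X)                   # n x n covariance over the input features
K = np.kron(Bc, Kx)              # (n*T) x (n*T) joint covariance, T = 2 targets
K += 1e-8 * np.eye(K.shape[0])   # jitter for numerical stability
```

Because K now spans both samples and targets, the entries of Bc must be learned alongside the input-kernel hyperparameters, which is where the extra optimization cost comes from.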

You can train a multivariate GP model using the MRSM function.

MGP = MRSM(X, Y, option)

Here, MGP is the trained model, X is the matrix of input factors, and Y is the matrix of targets. The option argument can take one or more attributes.

After training the model, we can use the predict_resp function to make predictions on a new dataset.

[Y_, S_] = predict_resp(MGP, X_);

where X_ is the new dataset, Y_ is the mean prediction, and S_ is the prediction uncertainty.

Simple example

Now let’s try to implement a simple example:

  1. Define a simple input matrix with 4 samples and two features:
    >> X = [1,2; 2, 1; 2, 3; 3, 4]
    X =
     1     2
     2     1
     2     3
     3     4
  2. Define the corresponding output matrix with 4 samples and two targets.
    >> Y = [3, 4; 2, 3; 1, 2; 4, 5]
    Y =
     3     4
     2     3
     1     2
     4     5
  3. Build an MRSM model over the data (here I let the code use the default options).
    >> MGP = MRSM (X, Y);
  4. Predict the response at a new input point using the trained model.
    >> X_ = [2.5, 3.5]
    X_ =
    2.5000    3.5000
    >> [Y_, S_] = predict_resp(MGP, X_)
    Y_ =
    2.2442    3.0326
    S_ =
    0.3581    0.4193
    0.4193    1.0432

    Y_ is the mean prediction for the two targets, and S_ is the covariance matrix summarizing the prediction uncertainty and the correlation between the two targets at this specific test point. For instance, this result shows that the covariance between the two targets at this point is 0.4193.
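The same workflow can be sketched in Python using the simplified separable multi-output covariance. The cross-target matrix B and the kernel hyperparameters below are assumed for illustration, so the resulting numbers will not match the MRSM output above:

```python
import numpy as np

def rbf(X1, X2, length=1.0):
    """Squared-exponential kernel between two sets of row vectors."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

# Same toy data as the MATLAB example above.
X = np.array([[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 4.0]])
Y = np.array([[3.0, 4.0], [2.0, 3.0], [1.0, 2.0], [4.0, 5.0]])
X_new = np.array([[2.5, 3.5]])

# Assumed cross-target covariance (illustrative; MRSM learns this from data).
B = np.array([[1.0, 0.5], [0.5, 1.0]])
n, T = Y.shape

K = np.kron(B, rbf(X, X)) + 1e-6 * np.eye(n * T)   # joint train covariance
Ks = np.kron(B, rbf(X_new, X))                     # test-train cross covariance
Kss = np.kron(B, rbf(X_new, X_new))                # test covariance (T x T)

y = Y.T.ravel()                  # stack target columns: [target 1; target 2]
mean = Ks @ np.linalg.solve(K, y)          # predictive mean, one entry per target
cov = Kss - Ks @ np.linalg.solve(K, Ks.T)  # predictive covariance across targets
```

Here cov plays the role of S_: its diagonal gives the per-target uncertainty and its off-diagonal entry the cross-target covariance at the test point.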