In multi-output regression (also called multi-target, multivariate, or multi-response regression), we aim to predict multiple real-valued output variables. One simple approach is to combine several single-output regression models, but this approach has drawbacks and limitations [1]:
To address these drawbacks and limitations, we look for multi-output regression methods that model multi-output datasets by considering not only the relationships between the input factors and the corresponding targets but also the relationships among the targets themselves. Several regression methods have been developed for multi-output problems. Click here for a great review of these methods. For example, multi-target SVMs and random forests are two of the most popular.
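To make the naive single-output baseline concrete, here is a minimal sketch in Python/NumPy (with made-up toy data): one independent linear model is fit per target column, so any correlation between the targets is ignored by construction.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 input factors, 2 correlated targets.
X = np.array([[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 4.0]])
Y = np.array([[3.0, 4.0], [2.0, 3.0], [1.0, 2.0], [4.0, 5.0]])

# Naive approach: fit one independent least-squares model per target column.
# Cross-target correlation in Y is ignored entirely.
X1 = np.hstack([np.ones((X.shape[0], 1)), X])     # add intercept column
coeffs, *_ = np.linalg.lstsq(X1, Y, rcond=None)   # one coefficient vector per target

x_new = np.array([[1.0, 2.5, 3.5]])               # intercept + new input point
y_pred = x_new @ coeffs                           # independent prediction per target
```

Each column of `coeffs` is estimated separately, which is exactly the limitation that motivates modeling the targets jointly.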
In this research, I propose and implement a new technique for multi-output regression using a Gaussian process (GP) model.
Let’s first start by introducing the univariate GP. A univariate GP defines a Gaussian distribution over functions, which can be used for nonlinear regression, classification, ranking, preference learning, or ordinal regression. The univariate GP has several advantages over other regression techniques:
PyKrige is a Python toolkit that implements the univariate GP model. In MATLAB, you can use fitrgp to implement a univariate GP model.
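As a quick illustration of univariate GP regression (independent of either toolkit), here is a minimal NumPy sketch with a squared-exponential kernel and fixed, hand-picked hyperparameters; a real implementation would learn them by maximizing the likelihood of the observations.

```python
import numpy as np

# Squared-exponential (RBF) kernel between two sets of points.
def sq_exp_kernel(A, B, length=1.0, sigma_f=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return sigma_f**2 * np.exp(-0.5 * d2 / length**2)

# Toy 1-D training data (made up for illustration).
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([0.8, 0.9, 0.1, -0.7])

noise = 1e-6                                   # small jitter for numerical stability
K = sq_exp_kernel(X, X) + noise * np.eye(len(X))

X_new = np.array([[2.5]])                      # test point
k_star = sq_exp_kernel(X, X_new)

# GP posterior mean and variance at the test point.
alpha = np.linalg.solve(K, y)
mean = k_star.T @ alpha
var = sq_exp_kernel(X_new, X_new) - k_star.T @ np.linalg.solve(K, k_star)
```

The posterior variance is what gives the GP its built-in uncertainty estimate, one of the advantages noted above.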
In this research, I extended the univariate GP to a multivariate GP to handle problems with multiple interdependent targets. The key point of this extension is redefining the covariance matrix so that it can capture the correlation among the targets. To this end, I used the nonseparable covariance structure described by Thomas E. Fricker. A conventional univariate GP model builds the spatial covariance matrix over the input features, and the kernel’s hyperparameters are then learned by maximizing the likelihood of the observations. In the multivariate GP, however, we extend the covariance matrix to capture the correlation between both features and targets. This involves a larger number of hyperparameters to optimize, so the optimization process takes somewhat longer. For the mathematical description of the multivariate GP, please see our recent publication [click here].
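To show where the extra hyperparameters come from, here is an illustrative NumPy sketch of the simpler *separable* case, where the joint covariance is a Kronecker product of a between-target covariance matrix and a spatial kernel. Fricker’s nonseparable structure, used in this work, generalizes this by letting the spatial correlation differ per pair of targets; the numbers in `B` below are made up for illustration.

```python
import numpy as np

# Squared-exponential kernel over the input features.
def sq_exp(A, B_, length=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B_**2, 1)[None, :] - 2 * A @ B_.T
    return np.exp(-0.5 * d2 / length**2)

X = np.array([[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 4.0]])  # n = 4 inputs
Kx = sq_exp(X, X)                                               # n x n spatial covariance

# Between-target covariance: its entries are extra hyperparameters that
# must be learned alongside the kernel's length scales (values are made up).
B = np.array([[1.0, 0.6],
              [0.6, 1.0]])

# Separable joint covariance over all (input, target) observation pairs.
K = np.kron(B, Kx)          # (n*t) x (n*t) for t = 2 targets
```

Every entry of `B` adds a hyperparameter, which is why training the multivariate model takes longer than training independent univariate GPs.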
You can train a multivariate GP model using the MRSM function.
MGP = MRSM(X,Y, option)
Here, MGP is the trained model, X is the matrix of input factors, and Y is the matrix of targets. option is a struct that can have one or several fields:
Option.s: defines the upper bound for the correlation matrix among the targets. Setting this value very small effectively removes all correlation between the targets when building the multivariate GP model.
Option.degree: defines the degree of the polynomial function used as the trend function for the multivariate GP model. It can be any integer from 0 to 4; the default is 0.
Option.optim: defines the optimization method used during training. The default is 'fmincon'. You also have the option of 'Global', which uses a global optimization algorithm; this option is more accurate, but the optimization process takes longer.
After training the model, we can use the predict_resp function to make predictions on a new dataset.
[Y_, S_] = predict_resp(MGP, X_);
Here, X_ is the new dataset, Y_ is the mean value of the prediction, and S_ is the uncertainty of the prediction.
Now let’s try to implement a simple example:
>> X = [1,2; 2, 1; 2, 3; 3, 4]
X =
1 2
2 1
2 3
3 4
>> Y = [3, 4; 2, 3; 1, 2; 4, 5]
Y =
3 4
2 3
1 2
4 5
>> MGP = MRSM(X, Y);
>> X_ = [2.5, 3.5]
X_ =
2.5000 3.5000
>> [Y_, S_] = predict_resp(MGP, X_)
Y_ =
2.2442 3.0326
S_ =
0.3581 0.4193
0.4193 1.0432
Y_ is the mean value of the prediction for the two targets, and S_ is the covariance matrix summarizing both the uncertainty of the prediction and the correlation between the two targets at this specific test point. For instance, the off-diagonal entry 0.4193 is the covariance between the two targets here; dividing by the two standard deviations gives a correlation of about 0.69.
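Note that S_ is a covariance matrix, so its off-diagonal entry is a covariance, not a correlation; the correlation follows by normalizing with the standard deviations of the two targets. A quick check in Python/NumPy, using the S_ values from the session above:

```python
import numpy as np

# Predictive covariance matrix (values copied from the session above).
S = np.array([[0.3581, 0.4193],
              [0.4193, 1.0432]])

# Divide each entry by the product of the corresponding standard
# deviations to turn the covariance matrix into a correlation matrix.
std = np.sqrt(np.diag(S))
corr = S / np.outer(std, std)
print(round(corr[0, 1], 2))   # → 0.69
```

The diagonal entries of S_ remain the per-target predictive variances, while `corr[0, 1]` quantifies how strongly the two targets move together at this test point.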