UCL-SML / Doubly-Stochastic-DGP

Deep Gaussian Processes with Doubly Stochastic Variational Inference
Apache License 2.0

Handling a DGP network that steps up dimensions #21

Open Hebbalali opened 6 years ago

Hebbalali commented 6 years ago

Dear Hugh, firstly I would like to thank you for your paper "Doubly Stochastic Variational Inference for DGP" and for sharing its implementation online. I started working with your implementation recently, and the new modifications with the natural gradients are very interesting. However, I have an issue with some configurations of the deep Gaussian network: when the number of units in a hidden layer is set to a number different from the input dimension, the model cannot be instantiated. For example, for a regression problem with an input dimension of 2, the following code does not work:

kernels = [RBF(2, ARD=True), RBF(3, ARD=True), RBF(3, ARD=True), RBF(2, ARD=True)]
model = DGP(X, Y, X, kernels, Gaussian_lik(), num_samples=100)

with the following error:

assertion failed: [] [Condition x == y did not hold element-wise:] [x (autoflow/RBF/compute_K_symm_40/strided_slice_1:0) = ] [2] [y (autoflow/RBF/compute_K_symm_40/Const:0) = ] [3]

Looking back at your paper, the mean functions of the hidden layers are initialized from the SVD of the data, using the top dim_output eigenvectors to initialize W. However, this only works if dim_input > dim_output at layer l, which restricts the network to a non-increasing number of units across the layers. This limits the structure of the DGP, since a greater number of hidden units might discover other features. So it may be interesting to handle the case dim_output > dim_input in the initialization as well, for example by using an augmented matrix of the data.
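The initialization described above could be sketched as follows (a minimal NumPy sketch for illustration, not the repository's actual code; the function name and the zero-padding choice for the step-up case are assumptions based on this thread):

```python
import numpy as np

def init_linear_mean_function(X, dim_out):
    """Initialize the weights W of a linear mean function x -> x @ W.

    Stepping down (dim_out < dim_in): W is taken from the top
    right-singular vectors of the centred data (the PCA directions),
    as described in the paper.
    Stepping up (dim_out > dim_in): the extra columns are zero-padded,
    which is one possible choice discussed in this thread.
    """
    dim_in = X.shape[1]
    if dim_out == dim_in:
        return np.eye(dim_in)
    if dim_out < dim_in:
        # Rows of V are the principal directions of the centred data.
        _, _, V = np.linalg.svd(X - X.mean(0), full_matrices=False)
        return V[:dim_out, :].T  # shape (dim_in, dim_out)
    # Step up: identity on the first dim_in outputs, zeros elsewhere.
    W = np.zeros((dim_in, dim_out))
    W[:, :dim_in] = np.eye(dim_in)
    return W
```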

Sincerely,

Ali Hebbal

hughsalimbeni commented 6 years ago

Thanks for your question. The current code doesn't support stepping up dimensions, but this can be added fairly easily. One reason this got dropped in the recent upgrade (the old code did support it, though I never used it) is that it isn't totally clear to me what the best way to initialize the extra dimensions is. Before, it was just zeros (for the mean functions, variational parameters, and inducing points), but maybe this isn't the only sensible choice.

I will do this later today, padding all additional dimensions with zeros, unless anyone has alternative ideas. NB: experiments stepping up dimensions have not been published as far as I know, so I make no claims as to what might happen.
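The zero-padding applied to the inducing inputs could be read as follows (a hypothetical NumPy sketch of one interpretation, not the merged code; `pad_inducing_inputs` is an illustrative name):

```python
import numpy as np

def pad_inducing_inputs(Z, dim_out):
    """Pad inducing inputs Z of shape (M, dim_in) with zero columns
    up to dim_out, so a step-up layer can reuse a lower-dimensional
    initialization. Assumes dim_out >= dim_in."""
    M, dim_in = Z.shape
    if dim_out < dim_in:
        raise ValueError("dim_out must be at least dim_in")
    return np.concatenate([Z, np.zeros((M, dim_out - dim_in))], axis=1)
```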

hughsalimbeni commented 6 years ago

Perhaps @kurtCutajar will correct me on that last point, on second thoughts

hughsalimbeni commented 6 years ago

@Hebbalali I've added the extra lines and merged. Let me know if that works for you

Hebbalali commented 6 years ago

Thank you @hughsalimbeni for your quick response. I will test the new implementation and keep you informed.