stan-dev / docs

Documentation for the Stan language and CmdStan
https://mc-stan.org/docs/

[FR] Adding Gaussian Process Latent Variable Model Examples #442

Open InfProbSciX opened 2 years ago

InfProbSciX commented 2 years ago

Summary:

I'd like to add examples of GPLVMs to the docs. A classical GPLVM (Lawrence, N. D., 2003, "Gaussian process latent variable models for visualisation of high dimensional data") should be quite easy to code up based on the existing GP examples in the Stan docs; a sketch follows.
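For concreteness, a classical GPLVM example might look something like this (a minimal sketch of my own, not tested code from this post: it marginalises the GP function values exactly, places a standard normal prior on the latent inputs, and uses the same kernel parameterisation as the inducing-point model below):

data {
    int<lower=1> n;              // number of observations
    int<lower=1> d;              // observed dimension
    int<lower=1> q;              // latent dimension
    matrix[n, d] Y;              // observed data
}
parameters {
    array[n] vector[q] X;        // latent inputs
    real<lower=1e-6> noise_var;  // observation noise variance
    real<lower=1e-6> funcn_var;  // kernel marginal standard deviation
}
model {
    // GP covariance over the latent inputs, plus noise and jitter on the diagonal
    matrix[n, n] L = cholesky_decompose(
        add_diag(gp_exp_quad_cov(X, funcn_var, 1.0), noise_var + 1e-6));
    for (i in 1:n)
        X[i] ~ std_normal();
    // each output dimension is an independent draw from the same GP
    for (j in 1:d)
        Y[, j] ~ multi_normal_cholesky(rep_vector(0.0, n), L);
}

This is the O(n^3)-per-evaluation version; the inducing-point bound below is the scalable alternative.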

Below I've provided an example of a GPLVM with inducing points, using the collapsed bound in which the inducing variables are integrated out. I've coded up the version of the lower bound that carries no explicit variational parameters, so that when it is used with mean-field VI the objective becomes the standard ELBO.
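For reference, the quantity `lower_bound` computes below appears to be, up to the jitter terms, the standard collapsed bound of Titsias (2009, "Variational Learning of Inducing Variables in Sparse Gaussian Processes"), summed over the $d$ output columns of $Y$. Writing $\beta$ for `noise_inv` and $Q_{ff} = K_{fu} K_{uu}^{-1} K_{uf}$,

$$\mathcal{F} \;=\; \sum_{j=1}^{d} \log \mathcal{N}\!\left(y_{:,j} \,\middle|\, 0,\; Q_{ff} + \beta^{-1} I_n\right) \;-\; \frac{\beta d}{2}\,\operatorname{tr}\!\left(K_{ff} - Q_{ff}\right).$$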

functions {
    // Collapsed lower bound for the sparse (inducing point) GP likelihood,
    // with the inducing variables integrated out analytically.
    //   K_fu      n x m cross-covariance between data and inducing inputs
    //   K_uu      m x m covariance among inducing inputs
    //   Y         n x d observed data
    //   noise_inv noise precision, 1 / noise variance
    //   k_diag    marginal prior variance k(x, x), constant for a stationary kernel
    real lower_bound(
            matrix K_fu,
            matrix K_uu,
            matrix Y,
            real noise_inv,
            real k_diag
        ) {
        int n = rows(Y);
        int d = cols(Y);
        int m = rows(K_uu);

        real elbo;
        real eta;

        matrix[m, m] phi;
        matrix[n, m] psi;
        matrix[d, d] W;

        // jitter to keep the m x m solves numerically stable
        matrix[m, m] jit = diag_matrix(rep_vector(1e-6, m));

        // with point-estimated latents the psi statistics reduce to the kernel itself
        phi = K_fu' * K_fu;
        psi = K_fu;
        eta = n * k_diag;  // trace of the full n x n prior covariance K_ff

        // Based on eqn 3.25 of Damianou, Andreas (2015),
        // Deep Gaussian Processes and Variational Propagation of Uncertainty,
        // PhD thesis, University of Sheffield.

        W = (noise_inv^2) * (Y' * psi) * inverse(noise_inv * phi + K_uu + jit) * (Y' * psi)';

        // data-fit term (divided by d here, multiplied back on return)
        elbo  = -0.5 * (noise_inv * trace(Y' * Y) - trace(W)) / d;
        // log-determinant terms of the collapsed covariance
        elbo += log(noise_inv) * n * 0.5 + log_determinant(K_uu + jit) * 0.5;
        elbo -= log(2 * pi()) * (n / 2.0);
        elbo -= log_determinant(K_uu + jit + phi * noise_inv) / 2.0;
        // trace correction, -(noise_inv / 2) * tr(K_ff - Q_ff)
        elbo -= noise_inv * eta / 2.0;
        elbo += noise_inv * trace(mdivide_left_spd(K_uu + jit, phi)) / 2.0;
        return elbo * d;
    }
}
data {
    int<lower=1> n;              // number of observations
    int<lower=1> d;              // observed dimension
    int<lower=1> q;              // latent dimension
    int<lower=1> m;              // number of inducing points
    matrix[n, d] Y;              // observed data
    array[m] vector[q] Z;        // inducing inputs (fixed)
}
parameters {
    array[n] vector[q] X;        // latent inputs
    real<lower=1e-6> prior_var;  // prior variance of the latents
    real<lower=1e-6> noise_var;  // observation noise variance
    real<lower=1e-6> funcn_var;  // kernel marginal standard deviation
}
model {
    matrix[m, m] K_uu = gp_exp_quad_cov(Z, funcn_var, 1.0);
    matrix[n, m] K_fu = gp_exp_quad_cov(X, Z, funcn_var, 1.0);
    real k_diag = K_uu[1, 1];  // equals funcn_var^2 for this stationary kernel

    for (i in 1:q)
        X[, i] ~ normal(0, sqrt(prior_var));  // isotropic Gaussian prior on the latents
    target += lower_bound(K_fu, K_uu, Y, 1 / noise_var, k_diag);
}
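Two standard identities explain why `lower_bound` only ever factorises m x m matrices. With $\Phi = K_{uf} K_{fu}$ (the `phi` in the code) and $\beta$ = `noise_inv` as above, the matrix determinant lemma and the Woodbury identity give

$$\log\left|\beta^{-1} I_n + K_{fu} K_{uu}^{-1} K_{uf}\right| \;=\; -\,n \log \beta \;-\; \log\left|K_{uu}\right| \;+\; \log\left|K_{uu} + \beta \Phi\right|,$$

$$y^{\top}\!\left(\beta^{-1} I_n + K_{fu} K_{uu}^{-1} K_{uf}\right)^{-1}\! y \;=\; \beta\, y^{\top} y \;-\; \beta^{2}\, y^{\top} K_{fu}\left(K_{uu} + \beta \Phi\right)^{-1} K_{uf}\, y,$$

which are exactly the `log_determinant` and `W` terms in the function above.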

The data (Lawrence's three-phase oil-flow dataset) can be obtained as follows:

wget http://staffwww.dcs.shef.ac.uk/people/N.Lawrence/resources/3PhData.tar.gz
tar xvf 3PhData.tar.gz
rm DataTst* DataVdn* 3PhData.tar.gz

The R script to run it:

library(ggplot2)
library(cmdstanr)
library(data.table)

set.seed(42)

Y = as.matrix(fread('DataTrn.txt'))
labels = as.character(as.matrix(fread('DataTrnLbls.txt')) %*% 1:3)  # one-hot labels -> class index

n = nrow(Y); q = 2; d = ncol(Y); m = 36
X = matrix(rnorm(n*q), n, q)  # note: not passed to Stan; X is a parameter of the model
Z = matrix(rnorm(m*q), m, q)  # inducing inputs

model = cmdstan_model('model.stan', cpp_options=list(stan_opencl=TRUE))  # the Stan program above

results = model$variational(
    data=list(n=n, q=q, m=m, d=d, Y=Y, Z=Z),
    seed=42,
    iter=2000,
    opencl_ids=c(0, 0),
    algorithm='meanfield',
    eta=1,                 # fixed step size,
    adapt_engaged=FALSE    # with adaptation disabled
)

X_recon = results$summary('X')[c('variable', 'mean')]
X_recon = matrix(X_recon$mean, n, q, byrow=FALSE)  # Stan flattens containers column-major

ggplot(data=data.frame(x=X_recon[, 1], y=X_recon[, 2], lb=labels)) +
    geom_point(aes(x=x, y=y, color=lb), alpha=0.2)

The recovered latent variables should look as follows.

(Screenshot from the original post: the two-dimensional latent space, coloured by class label.)
spinkney commented 2 years ago

This is an interesting model. Can you make an argument for this being in the docs vs a case study?