mststats / VBLPCM


Question re estimation of cluster scale `sigma` and density parameter `beta` #1

Open tillahoffmann opened 2 years ago

tillahoffmann commented 2 years ago

Thanks for making the code for variational estimation available.

I am trying to infer the cluster scale sigma and density parameter beta for a single-cluster model. Based on the code and variational approximation in eq. (9) of the accompanying paper, I expected that the posterior approximation is

beta ~ normal(mean=V_xi_e, scale=sqrt(V_psi2_e))
sigma^2 ~ invchi2(dof=V_alpha) / inv_sigma02

where V_xi_e, V_psi2_e, V_alpha, and inv_sigma02 are attributes of the object returned by vblpcmfit. Apologies for the pseudocode; my R skills are virtually non-existent.
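For concreteness, here is a minimal Python sketch of drawing samples from the approximation as hypothesized above. It assumes the stated forms are right, which is exactly what is in question. The function name and the numeric inputs are placeholders for illustration, not values returned by vblpcmfit.

```python
import numpy as np


def sample_posterior_approximation(xi, psi2, alpha, inv_sigma02, size, rng):
    """Draw (beta, sigma^2) from the hypothesized variational posterior."""
    # beta ~ normal(mean=xi, scale=sqrt(psi2))
    beta = rng.normal(loc=xi, scale=np.sqrt(psi2), size=size)
    # If x ~ chi2(dof), then 1 / x ~ invchi2(dof); rescale by 1 / inv_sigma02.
    sigma2 = 1.0 / rng.chisquare(df=alpha, size=size) / inv_sigma02
    return beta, sigma2


rng = np.random.default_rng(0)
# Placeholder inputs standing in for V_xi_e, V_psi2_e, V_alpha, inv_sigma02.
beta, sigma2 = sample_posterior_approximation(
    xi=0.5, psi2=0.01, alpha=100.0, inv_sigma02=0.01, size=10_000, rng=rng
)
print(beta.mean(), sigma2.mean())
```

Under this reading, the mean of sigma^2 is 1 / ((V_alpha - 2) * inv_sigma02) for V_alpha > 2, which gives a quick sanity check against any point estimate the package reports.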

To test this hypothesis, I generated 50 realizations of latent space networks with 100 nodes each as follows:

  1. sample density parameter beta ~ normal(0, 1)
  2. sample scale parameter sigma^2 ~ gamma(4, 1 / 4)
  3. sample latent positions z ~ normal(0, sigma^2)
  4. sample adjacency A_{ij} ~ bernoulli(expit(beta - d_{ij})), where d_{ij} is the distance between nodes i and j and expit is the logistic sigmoid.

I fitted the latent space models with G = 1 groups after rejecting graphs with number of edges less than the number of nodes.
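The generative steps and the rejection rule can be sketched in Python roughly as follows. Several details are assumptions not stated above: a two-dimensional latent space, an undirected graph without self-loops, gamma(4, 1 / 4) read as shape 4 and scale 1/4, and sigma^2 read as the variance of the latent positions. The function name `simulate_network` is hypothetical.

```python
import numpy as np


def simulate_network(n_nodes, n_dims, rng):
    """Simulate one single-cluster latent space network (steps 1 to 4)."""
    beta = rng.normal(0.0, 1.0)                # step 1: density parameter
    sigma2 = rng.gamma(shape=4.0, scale=0.25)  # step 2: assumed shape/scale
    # Step 3: latent positions with variance sigma2 in each dimension.
    z = rng.normal(0.0, np.sqrt(sigma2), size=(n_nodes, n_dims))
    # Step 4: Euclidean distances and logistic edge probabilities.
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    prob = 1.0 / (1.0 + np.exp(-(beta - d)))
    # Sample the strict upper triangle once and mirror it (undirected graph).
    upper = np.triu(rng.random((n_nodes, n_nodes)) < prob, k=1)
    adjacency = (upper | upper.T).astype(int)
    return beta, sigma2, adjacency


rng = np.random.default_rng(0)
n_nodes = 100
accepted = []
n_proposed = 0
while len(accepted) < 50:
    beta, sigma2, adjacency = simulate_network(n_nodes, n_dims=2, rng=rng)
    n_proposed += 1
    n_edges = adjacency.sum() // 2
    # Rejection step: keep only graphs with at least as many edges as nodes.
    if n_edges >= n_nodes:
        accepted.append((beta, sigma2, adjacency))

print(f"accepted {len(accepted)} of {n_proposed} proposed graphs")
```

Comparing the betas of the accepted graphs with those of all proposed graphs shows how much the rejection step shifts the effective prior.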

density parameter beta

For the variational approximation of the density parameter beta, I relied on the following snippet and plotted the inferred beta against the beta used to generate the data (figure below).

https://github.com/mststats/VBLPCM/blob/a8c3289b36c3581b11d6f61ce81ec1c18c10c1b1/R/summary.vblpcm.R#L11

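[image: inferred beta against the beta used to generate the data] https://user-images.githubusercontent.com/966348/188331858-1b69362b-2ce8-416d-9476-86a019b4867b.png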

The inferred value seems to be consistently larger than the value used to generate the data.

scale parameter sigma

For the scale parameter, I used the following snippet and again plotted inferred values against values used to generate the data.

https://github.com/mststats/VBLPCM/blob/a8c3289b36c3581b11d6f61ce81ec1c18c10c1b1/R/plot_network.R#L92

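[image: inferred sigma against the sigma used to generate the data] https://user-images.githubusercontent.com/966348/188332274-a2b508ce-2930-44f8-a643-029e9d510e68.png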

There doesn't seem to be much correlation between the values inferred and the values used to generate the data here.

Do you know what might be going on here or where I'm misunderstanding?

mststats commented 2 years ago

Dear Till,

The method isn't really designed to work for a single cluster. More importantly, the variational approximation favours speed over accuracy. Sampling from the posterior distribution using MCMC is the better approach, but it scales very badly for network problems and can't handle even modest-size data. Hence the resort to Variational Bayes, which is known to yield overconfident approximations. It was designed for finding the clusters rather than estimating parameters of a single cluster.

In fact the beta plot is encouraging. There is a bias, in that the method consistently overestimates the density, but the slope is about 1. For 100 nodes, you should be able to also fit using latentnet (MCMC method), right? It will be slow, but doable. Does this also overestimate beta? Do the credible intervals usually enclose the values used to generate the data?
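One way to put numbers on slope, bias, and interval coverage is sketched below in Python. The helper `calibration_summary` is hypothetical, the intervals are assumed to be symmetric Gaussian intervals (mean plus or minus 1.96 standard deviations), and the inputs are synthetic numbers made up purely to exercise the function; they are not results from the experiment in this thread.

```python
import numpy as np


def calibration_summary(truth, post_mean, post_sd, z=1.96):
    """Summarize how well posterior estimates recover the generating values."""
    truth = np.asarray(truth, dtype=float)
    post_mean = np.asarray(post_mean, dtype=float)
    post_sd = np.asarray(post_sd, dtype=float)
    # Least-squares line of posterior mean against the generating value.
    slope, intercept = np.polyfit(truth, post_mean, deg=1)
    # Average signed error; positive means overestimation.
    bias = np.mean(post_mean - truth)
    # Fraction of intervals mean +/- z * sd that contain the generating value.
    coverage = np.mean(np.abs(post_mean - truth) <= z * post_sd)
    return {"slope": slope, "intercept": intercept,
            "bias": bias, "coverage": coverage}


rng = np.random.default_rng(0)
# Synthetic illustration only: offset estimates with overly narrow intervals.
truth = rng.normal(0.0, 1.0, size=50)
post_mean = truth + 0.3 + rng.normal(0.0, 0.1, size=50)
post_sd = np.full(50, 0.05)
summary = calibration_summary(truth, post_mean, post_sd)
print(summary)
```

A slope near 1 with a positive bias and coverage far below the nominal 95% would match the pattern of a constant offset combined with overconfident intervals. The same function can be applied to MCMC output for comparison.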

Does the sigma plot include the VB standard errors?

There is a possibility that rejecting networks where the #edges is less than #nodes creates a bias.

Kind regards,

Dr. Michael Salter-Townshend
School of Mathematics and Statistics
University College Dublin
http://maths.ucd.ie/~mst/


tillahoffmann commented 2 years ago

Thank you for the quick response. I'll have a look at fitting with an MCMC sampler, e.g. Stan or the latentnet package you mentioned.

> It was designed for finding the clusters rather than estimating parameters of a single cluster.

I've also been experimenting with VB for inferring the parameters of planted partition models (based on https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.100.258701) and had similar challenges. It seems that, as you suggested, the cluster assignments are just "too overconfident" to infer the parameters.

> Does the sigma plot include the VB standard errors?

I didn't add the error bars for the sigma plot, apologies. Will try to update over the coming days.

> There is a possibility that rejecting networks where the #edges is less than #nodes creates a bias.

Yes, good point. I'll have a look at priors that don't require a reject-step.

Thanks again for the feedback. I'll report back once I've got MCMC samples.