Open Xiaodonghai opened 4 years ago
The current code does seem to follow the original paper and not the correction:
# Compute the exponential and cosine terms
exp_term = (x1_exp_ - x2_exp_).pow_(2).mul_(-2 * math.pi ** 2)
cos_term = (x1_cos_ - x2_cos_).mul_(2 * math.pi)
res = exp_term.exp_() * cos_term.cos_()
# Product over dimensions
if last_dim_is_batch:
res = res.squeeze(-1)
else:
res = res.prod(-1)
# Sum over mixtures
mixture_weights = self.mixture_weights
@jacobrgardner @gpleiss
The current code does seem to follow the original paper and not the correction:
# Compute the exponential and cosine terms exp_term = (x1_exp_ - x2_exp_).pow_(2).mul_(-2 * math.pi ** 2) cos_term = (x1_cos_ - x2_cos_).mul_(2 * math.pi) res = exp_term.exp_() * cos_term.cos_() # Product over dimensions if last_dim_is_batch: res = res.squeeze(-1) else: res = res.prod(-1) # Sum over mixtures mixture_weights = self.mixture_weights
@jacobrgardner @gpleiss
Thank you for your reply.
In addition, I feel that the forward function seems to have some problems, but I'm not sure. Firstly, we can obtain self.mixture_scales = varz, which is the covariances of GMM.
> from sklearn.mixture import GaussianMixture
GMM = GaussianMixture(n_components=self.num_mixtures, covariance_type="diag").fit(inv_spec)
means = GMM.means_
varz = GMM.covariances_
weights = GMM.weights_
self.mixture_means = means
self.mixture_scales = varz
self.mixture_weights = weights
> ```
Then self.mixture_scales = varz is used in forward function as follows:
```python
> # Compute distances - scaled by appropriate parameters
x1_exp = x1_ * self.mixture_scales
x2_exp = x2_ * self.mixture_scales
x1_cos = x1_ * self.mixture_means
x2_cos = x2_ * self.mixture_means
# Create grids
x1_exp_, x2_exp_ = self._create_input_grid(x1_exp, x2_exp, last_dim_is_batch=last_dim_is_batch, **params)
x1_cos_, x2_cos_ = self._create_input_grid(x1_cos, x2_cos, last_dim_is_batch=last_dim_is_batch, **params)
# Compute the exponential and cosine terms
exp_term = (x1_exp_ - x2_exp_).pow_(2).mul_(-2 * math.pi ** 2)
cos_term = (x1_cos_ - x2_cos_).mul_(2 * math.pi)
res = exp_term.exp_() * cos_term.cos_()
> ```
I doubt whether
```python
> # Compute distances - scaled by appropriate parameters
x1_exp = x1_ * self.mixture_scales
x2_exp = x2_ * self.mixture_scales
> ```
should be
```python
> # Compute distances - scaled by appropriate parameters
x1_exp = x1_ * np.sqrt(self.mixture_scales)
x2_exp = x2_ * np.sqrt(self.mixture_scales)
> ```
, because their differences (x1_exp - x2_exp) will be squared.
Thanks for your kind attention and look forward your prompt reply.
I think in many cases it is more convenient and similarly effective to use a product of 1D spectral mixture kernels for a multi-D spectral mixture (as in the "SMP" kernel formulation of https://www.cs.cmu.edu/~andrewgw/manet.pdf), which I believe is what is implemented here, over using the result of inverse Fourier transforming a multidimensional mixture of Gaussians (which is what is presented in the typo correction note).
I think in many cases it is more convenient and similarly effective to use a product of 1D spectral mixture kernels for a multi-D spectral mixture (as in the "SMP" kernel formulation of https://www.cs.cmu.edu/~andrewgw/manet.pdf), which I believe is what is implemented here, over using the result of inverse Fourier transforming a multidimensional mixture of Gaussians (which is what is presented in the typo correction note).
I sincerely thank you for your reply and have benefited a lot. Another question here is whether
x1_exp = x1_ * self.mixture_scales
x2_exp = x2_ * self.mixture_scales
should be
x1_exp = x1_ * np.sqrt(self.mixture_scales)
x2_exp = x2_ * np.sqrt(self.mixture_scales)
in the forward()
function, because self.mixture_scales = varz = GMM.covariances_
and pow_(2)
is carried out as follows:
exp_term = (x1_exp_ - x2_exp_).pow_(2).mul_(-2 * math.pi ** 2)
Looking forward to your reply. Many thanks!
🐛 Bug
exp_term = (x1exp - x2exp).pow(2).mul(-2 * math.pi * 2) cos_term = (x1cos - x2cos).mul_(2 math.pi) res = expterm.exp() * costerm.cos()
For multi-dimensional inputs, according to the reference [1], it seems that _costerm should sum on dimensions first and then perform cosine transformation. [1] Andrew Gordon Wilson. 'Correction to Spectral Mixture (SM) Kernel Derivation for Multidimensional Inputs'. May 15, 2015. Available at http://www.cs.cmu.edu/~andrewgw/typo.pdf
Thanks for your kind attention and look forward your prompt reply.