guilgautier / DPPy

Python toolbox for sampling Determinantal Point Processes
https://dppy.readthedocs.io
MIT License

Numerical stability of MCMC for k-DPP #58

Closed DavidBurt2 closed 4 years ago

DavidBurt2 commented 4 years ago

There appear to be numerical issues when performing approximate MCMC sampling from a k-DPP with a likelihood matrix whose eigenvalues are less than 1, if k is large.

Sample code

import numpy as np
from numpy.random import RandomState
from dppy.finite_dpps import FiniteDPP

rng = np.random.RandomState(413121)
r, N = 5, 200
k = 30

Phi = rng.randn(r, N)
L = Phi.T.dot(Phi) + 1e-1 * np.eye(N)  # add a multiple of the identity matrix to make L full rank
print(np.linalg.matrix_rank(L))  # rank 200

DPP = FiniteDPP('likelihood', **{'L': L})
DPP.sample_mcmc_k_dpp(size=k, random_state=rng)
print(DPP.list_of_samples)

Error Message ValueError: Initialization problem, you may be using a size k > rank of the kernel

Expected behavior As this matrix is reasonably well-conditioned, I would expect that samples can be drawn in a numerically stable way.

Additional context It seems there is a check at initialization that the matrix has rank at least 30, which involves directly computing determinants of k x k submatrices. I suspect this leads to underflow. Perhaps this can be avoided? In the exchange step, identities for the determinant of block matrices can be used, which I suspect would avoid some of these issues, although I am not sure they provide an effective way to check that the matrix has rank at least k.
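
For concreteness, here is a rough sketch of the kind of block-determinant identity I have in mind (illustrative only, this is not DPPy's implementation, and exchange_log_ratio is a made-up name): with T = S \ {i}, det L_{T ∪ {j}} = det L_T * (L_jj - L_jT L_T^{-1} L_Tj), so the exchange ratio det L_{S \ {i} ∪ {j}} / det L_S is a ratio of two Schur complements and can be evaluated in log-space without forming any k x k determinant.

import numpy as np

def exchange_log_ratio(L, S, i, j):
    # log( det L_{S - i + j} / det L_S ), with i in S and j not in S.
    # Both determinants share the common block L_T, T = S \ {i}, so the ratio
    # reduces to a ratio of Schur complements: only linear solves are needed,
    # no explicit k x k determinants that could underflow.
    T = [s for s in S if s != i]
    L_T = L[np.ix_(T, T)]
    schur_j = L[j, j] - L[j, T] @ np.linalg.solve(L_T, L[T, j])
    schur_i = L[i, i] - L[i, T] @ np.linalg.solve(L_T, L[T, i])
    return np.log(schur_j) - np.log(schur_i)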

guilgautier commented 4 years ago

Hi @DavidBurt2,

Thanks for playing with DPPy and raising this issue!

Let me first slightly modify the code you are using, to also print the condition number of the matrix:

import numpy as np
from dppy.finite_dpps import FiniteDPP

rng = np.random.RandomState(413121)
r, N = 5, 200
k = 30

# Random feature vectors
Phi = rng.randn(r, N)
L = Phi.T.dot(Phi) + 1e-1 * np.eye(N)  # add a multiple of the identity matrix to make L full rank
print('rank', np.linalg.matrix_rank(L))

L_e_vals = np.linalg.eigvalsh(L)
cond_number = max(L_e_vals) / min(L_e_vals)
print('condition number', cond_number)

DPP = FiniteDPP('likelihood', L=L)
DPP.sample_mcmc_k_dpp(size=k, random_state=rng)
print(DPP.list_of_samples)

> Expected behavior As this matrix is reasonably well-conditioned, I would expect that samples can be drawn in a numerically stable way.

rank 200
condition number 2296.366647336234

ValueError: Initialization problem, you may be using a size k > rank of the kernel

I don't know whether 2 * 10^3 characterizes a "reasonably well-conditioned matrix", but the important thing is that the k x k minors of L have (very) small determinants.
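
To give an idea of how small (a quick diagnostic snippet, not part of DPPy): determinants of random k x k principal minors of this L come out around the det L_S0 value shown further down, i.e. far below the hard-coded 1e-9 threshold mentioned below, while np.linalg.slogdet still returns a perfectly usable log-determinant.

import numpy as np

rng = np.random.RandomState(413121)
r, N, k = 5, 200, 30
Phi = rng.randn(r, N)
L = Phi.T.dot(Phi) + 1e-1 * np.eye(N)

# Determinants of a few random k x k principal minors of L
for _ in range(3):
    S = rng.choice(N, size=k, replace=False)
    L_S = L[np.ix_(S, S)]
    sign, logdet = np.linalg.slogdet(L_S)  # numerically stable (sign, log|det|)
    print('det:', np.linalg.det(L_S), 'log-det:', logdet)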

> Additional context It seems there is a check at initialization that the matrix has rank at least 30, which involves directly computing determinants of k x k submatrices. I suspect this leads to underflow. Perhaps this can be avoided? In the exchange step, identities for the determinant of block matrices can be used, which I suspect would avoid some of these issues, although I am not sure they provide an effective way to check that the matrix has rank at least k.

Actually, there is no check of the rank of L; the message is not very explicit, sorry for that!

An error is thrown because the initialization in initialize_AD_and_E_sampler failed: none of the 100 random k-subsets S0 satisfied the (hard-coded) constraint det L_S0 > 1e-9.

The choice of the initial set and the computation of the acceptance ratio are indeed the main issues with these Markov chains.

The simplest way to make things work is to provide your own initial k-subset s_init=S0 and your own number of steps nb_iter=?? for the exchange Markov chain:

import numpy as np
from dppy.finite_dpps import FiniteDPP

rng = np.random.RandomState(413121)
r, N = 5, 200
k = 30

# Random feature vectors
Phi = rng.randn(r, N)
L = Phi.T.dot(Phi) + 1e-1 * np.eye(N)  # add a multiple of the identity matrix to make L full rank

S0 = rng.choice(N, size=k, replace=False)
print('det L_S0 = ', np.linalg.det(L[np.ix_(S0, S0)]))

DPP = FiniteDPP('likelihood', L=L)
DPP.sample_mcmc_k_dpp(size=k, random_state=rng, s_init=S0, nb_iter=3)

print(DPP.list_of_samples, sep='\n\n')

det L_S0 =  5.561290886399235e-19
[[[37, 74, 182, 104, 83, 33, 157, 16, 15, 106, 173, 127, 69, 184, 61, 53, 11, 138, 170, 128, 196, 166, 139, 96, 26, 39, 36, 3, 103, 85], [37, 74, 182, 104, 83, 33, 157, 16, 15, 106, 173, 127, 69, 184, 61, 53, 11, 138, 170, 128, 196, 166, 139, 96, 26, 39, 36, 3, 103, 85], [37, 74, 182, 104, 83, 33, 157, 16, 15, 106, 173, 127, 69, 184, 61, 53, 11, 138, 170, 128, 196, 166, 139, 96, 26, 39, 36, 3, 103, 85]]]

Note that the likelihood of S0 under k-DPP(L) is SUPER small!
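
If you want a starting point with a larger det L_S0 than a uniformly random subset, one option (just a sketch, this is not a DPPy feature and greedy_maxdet_init is a made-up name) is a greedy max-det selection, whose output can then be passed as s_init. For this particular L the determinant will still be small in absolute terms, but the chain at least starts from a comparatively high-likelihood subset.

import numpy as np

def greedy_maxdet_init(L, k):
    # Greedily build a k-subset that approximately maximizes det L_S,
    # adding at each step the index with the largest Schur-complement gain.
    N = L.shape[0]
    S = [int(np.argmax(np.diag(L)))]
    for _ in range(k - 1):
        L_S = L[np.ix_(S, S)]
        rest = [j for j in range(N) if j not in S]
        gains = [L[j, j] - L[j, S] @ np.linalg.solve(L_S, L[S, j]) for j in rest]
        S.append(rest[int(np.argmax(gains))])
    return S

# e.g. S0 = greedy_maxdet_init(L, k), then DPP.sample_mcmc_k_dpp(size=k, s_init=S0, ...)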

@DavidBurt2 feel free to contribute to DPPy through a Pull Request 🙂

