guilgautier / DPPy

Python toolbox for sampling Determinantal Point Processes
https://dppy.readthedocs.io
MIT License
219 stars 52 forks source link

Non-symmetric DPP sampler #67

Closed ngocbh closed 2 years ago

ngocbh commented 2 years ago

I'm finding a sampling method that can simulate attractive behaviors of given points (finite) rather than repulsive behaviors as in symmetric DPP. I found some papers studying how to learn a non-symmetric kernel of DPP that can model both repulsive and attractive behaviors.

It seems that L need not be symmetric, provided that it is a P0-matrix. Furthermore, if L is K_ij * K_ji < 0, then item i and item j attract each other ( P({i, j} \in Y) > P(i \in Y)P(j \in Y) ).

How can I implement DPP(L) with L is a skew-symmetric matrix? It seems that I cannot use eigendecomposition since the eigenvalues of a real skew-symmetric matrix are purely imaginary, Could I use the MCMC instead?

guilgautier commented 2 years ago

Hi @ngocbh

How can I implement DPP(L) with L is a skew-symmetric matrix?

You can use the relation K = L (L + I)^-1 = I − (L + I)^−1 and create the corresponding DPP(K)

dpp = FiniteDPP("correlation", projection=False, K=K)

Could I use the MCMC instead?

Note that I'm currently working on a large refactoring of DPPy, which should simplify things when working with non-symmetric kernels.

ngocbh commented 2 years ago

@guilgautier I tried but I got an error:

/Users/ngocbh/.virtualenvs/alrc/lib/python3.9/site-packages/scipy/sparse/_index.py:82: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_intXint(row, col, x.flat[0])
Traceback (most recent call last):
  File "/Users/ngocbh/workspace/projects/vinai/Robust-Reweighting-LIME/utils/sampling.py", line 190, in <module>
    Z = dpp_sampling(g, None, 0.1, 5)
  File "/Users/ngocbh/workspace/projects/vinai/Robust-Reweighting-LIME/utils/sampling.py", line 172, in dpp_sampling
    dpp = FiniteDPP("correlation", projection=False, K=K)
  File "/Users/ngocbh/.virtualenvs/alrc/lib/python3.9/site-packages/dppy/finite_dpps.py", line 121, in __init__
    self.K = is_symmetric(params.get('K', None))
  File "/Users/ngocbh/.virtualenvs/alrc/lib/python3.9/site-packages/dppy/utils.py", line 133, in is_symmetric
    raise ValueError('array not symmetric: M.T != M')
ValueError: array not symmetric: M.T != M
guilgautier commented 2 years ago

For now, DPPy indeed allows to define a FiniteDPP only with symmetric kernels, K or L.

However, core sampling methods can be called to sample from non-symmetric DPPs

To generate exact samples, you can exploit the relation K = L (L + I)^-1 = I − (L + I)^−1 and use dppy.exact_sampling.dpp_sampler_generic_kernel(K) https://github.com/guilgautier/DPPy/blob/cb4577f75ca998481ca6f248af10b19f986eca1c/dppy/exact_sampling.py#L322

To generate approximate samples (using an MCMC method) have a look at dppy.mcmc_sampling.dpp_sampler_mcmc(L) https://github.com/guilgautier/DPPy/blob/cb4577f75ca998481ca6f248af10b19f986eca1c/dppy/mcmc_sampling.py#L24

If you provide a more concrete example (even a toy example), I might be able to help you further.

ngocbh commented 2 years ago

Hi @guilgautier, is it possible to use k-DPP or any others to sample from a conditional probability? For example, I have a set of items S. And I have already sampled a subset A \in S. How to sample a subset of k items from S\A so that it is diverse with respect to A (and k items are diverse also).

Thank you for answering my questions.

guilgautier commented 2 years ago

The class of DPPs is stable under some set operations and some conditioning, i.e., the resulting say conditioned process is still a DPP with a different kernel. See e.g.,