Same issue here: d_i^2 can become negative. Also, the stopping criterion does not make sense: d_i^2 < 1. If all input vectors are normalized, then d_i^2 < 1 is always the case. It seems like Algorithm 1 is broken. Could you post your code in the repository, so that we can find the issue?
@Nicholasyc @bduclaux Sorry about the late reply.
On page 4, right below equation (9), we mentioned that
The stopping criterion is d_i^2 < 1 for unconstrained MAP inference, or |Y_g| > N when the cardinality constraint is imposed. For the latter case, we introduce a small number epsilon > 0 and add d_i^2 < epsilon to the stopping criterion for numerical stability.
If the input vectors are normalized, then d_i^2 <= 1 is always the case. But in some cases, for example in equation (11), if we run Algorithm 1 directly on L, then d_i^2 can be larger than 1.
In theory, d_i^2 cannot become negative if L is positive semidefinite. It happens in practice because of numerical instability. That's why we add d_i^2 < epsilon to the stopping criterion, as mentioned above. Please refer to Supplementary Material section B for more discussion of numerical stability.
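For concreteness, here is a rough sketch of how the d_i^2 bookkeeping and the epsilon-based stopping check fit into the greedy loop. This is illustrative only, not the exact code; the function name greedy_map_dpp, the epsilon default, and the max_length argument are placeholders.

```python
import numpy as np

def greedy_map_dpp(kernel, max_length, epsilon=1e-10):
    """Greedy MAP inference sketch for a DPP with a PSD kernel.

    Stops either when max_length items have been selected or when the
    largest remaining d_i^2 falls below epsilon (numerical-stability guard).
    Assumes the kernel has at least one strictly positive diagonal entry.
    """
    n = kernel.shape[0]
    cis = np.zeros((max_length, n))                 # incremental Cholesky-style rows
    di2s = np.array(np.diag(kernel), dtype=float)   # d_i^2 initialized to L_ii
    selected = [int(np.argmax(di2s))]
    while len(selected) < max_length:
        k = len(selected) - 1
        j = selected[-1]
        dj = np.sqrt(di2s[j])
        # e_i = (L_ji - <c_j, c_i>) / d_j for every candidate i
        eis = (kernel[j, :] - cis[:k, j] @ cis[:k, :]) / dj
        cis[k, :] = eis
        di2s -= np.square(eis)       # d_i^2 only decreases; stays >= 0 in exact arithmetic
        di2s[selected] = -np.inf     # exclude already selected items from the argmax
        j_next = int(np.argmax(di2s))
        if di2s[j_next] < epsilon:   # stopping criterion for numerical stability
            break
        selected.append(j_next)
    return selected
```

The di2s[selected] = -np.inf line is just a convenience to avoid re-selecting items; in exact arithmetic the d^2 of a selected item drops to zero anyway.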
@laming-chen Thanks a lot! Very clear!
The supplementary material also states a very important theorem: when the kernel L is a low-rank matrix, Algorithm 1 of the paper will return at most rank(L) items. It means that numerical instability will come early in the calculation.
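As a quick sanity check of that low-rank behaviour, one can build a kernel of known low rank and watch the selection stop early. This reuses the greedy_map_dpp sketch from the earlier comment; the sizes and the rank 5 are arbitrary choices.

```python
import numpy as np

# Rank-5 PSD kernel for 100 items: L = B B^T has rank at most 5.
rng = np.random.default_rng(0)
B = rng.standard_normal((100, 5))
L = B @ B.T

print(np.linalg.matrix_rank(L))   # 5
# greedy_map_dpp is the sketch from the earlier comment; with max_length=20
# it should stop after about rank(L) = 5 items once d_i^2 hits the epsilon guard.
print(len(greedy_map_dpp(L, max_length=20)))
```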
@bduclaux Thanks for the comments. I have uploaded the code of Algorithm 1.
@laming-chen Thanks for your post. However, I have run into the same situation as @bduclaux: the numerical instability comes early in the calculation, and fewer items than max_length are returned in one iteration, so I have to execute more than one iteration, which may not make sense.
I think the reason may be differences in how our similarities are constructed, so is there any specific requirement for the construction of the similarity matrix and score matrix (values, scales, ...)?
Thanks for your reply.
@Nicholasyc I think there are some things we can do:
@Nicholasyc I think this is a good topic to discuss. Would you please translate the title and your original post to English? I hope the discussions can be helpful to others.
@laming-chen Thanks for your reply. So do you mean that the reason I cannot get k items as expected is that the rank of the similarity matrix is less than k? If so, I am a little confused about the construction of the similarity matrix, because we cannot always build the similarity matrix from feature vectors (e.g. when using Pearson correlation or some other method), and then the rank of the similarity matrix may not be guaranteed (although its dimension can be guaranteed to be N >> k).
I have translated the title and original post to English, thanks!
@Nicholasyc The rank of the similarity matrix could be a reason. I think constructing the similarity matrix from feature vectors is probably the easiest way to guarantee that the similarity matrix is positive semidefinite (PSD). If the similarity matrix is not PSD, then it's no longer a DPP.
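For example, something along these lines (a sketch; the function name similarity_from_features and the zero-norm guard are my own choices):

```python
import numpy as np

def similarity_from_features(features):
    """Cosine-similarity Gram matrix S = F F^T from L2-normalized feature rows.

    A Gram matrix is PSD by construction, and the normalization puts ones
    on the diagonal of S.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    F = features / np.clip(norms, 1e-12, None)   # guard against zero-norm rows
    return F @ F.T
```

Note that rank(S) is then at most the feature dimension, which is exactly the low-rank situation @bduclaux mentioned above.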
@laming-chen Thanks for your reply. According to the paper, to be precise, it is the kernel matrix constructed from quality and similarity that should be PSD, rather than the similarity matrix, right? And if the rank of the similarity matrix is indeed the reason, do we need to pay the cost of checking it before doing the follow-up steps? I wonder if there is an easy way to get a PSD matrix when the similarity and quality are calculated by a black box.
@Nicholasyc It is easy to prove that the kernel matrix is PSD if and only if the similarity matrix is PSD.
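A sketch of the argument, assuming the kernel is built from relevance scores and similarity as L = Diag(r) S Diag(r) with all r_i > 0, which is (if I recall correctly) how the paper combines quality and similarity:

```latex
\text{Assume } L = \operatorname{Diag}(r)\, S \,\operatorname{Diag}(r) \text{ with all } r_i > 0.
\text{For any } x,\ \text{let } y = \operatorname{Diag}(r)\, x,\ \text{so } x^\top L x = y^\top S y.
\text{Because } \operatorname{Diag}(r) \text{ is invertible, } x \mapsto y \text{ is a bijection, hence}
L \succeq 0 \iff x^\top L x \ge 0\ \forall x \iff y^\top S y \ge 0\ \forall y \iff S \succeq 0.
```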
To verify if a matrix is PSD or not, we can follow two steps:
@laming-chen I've tried to verify the eigenvalues with np.linalg.eig(matrix) to test whether all eigenvalues of the matrix are non-negative; if an eigenvalue is negative, I replace it with 0 and finally construct a new PSD matrix (following "Practical Diversified Recommendations on YouTube with Determinantal Point Processes", section 4.2). However, the time it costs is a bit unacceptable.
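One thing that should help with the speed: for a symmetric matrix, np.linalg.eigh / np.linalg.eigvalsh are the intended routines (they are generally much faster than np.linalg.eig and return real eigenvalues directly). A sketch of the check and of the clip-negative-eigenvalues repair described above; the tolerance and function names here are my own choices:

```python
import numpy as np

def is_psd(matrix, tol=1e-8):
    """PSD check via eigvalsh, which is meant for symmetric/Hermitian input."""
    sym = (matrix + matrix.T) / 2.0              # symmetrize away tiny asymmetries
    return float(np.linalg.eigvalsh(sym).min()) >= -tol

def clip_to_psd(matrix):
    """Clip negative eigenvalues to zero and rebuild V diag(lambda) V^T."""
    sym = (matrix + matrix.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sym)
    return (eigvecs * np.clip(eigvals, 0.0, None)) @ eigvecs.T
```

Both routines are still O(n^3), so for large item sets it is usually cheaper to make the similarity matrix PSD by construction (e.g. the Gram-matrix approach above) than to repair it afterwards.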
Hi, I have some questions about the code in Algorithm 1. Considering d_i^2 = L_ii, is the value on the diagonal of the kernel L something like a CTR? If so, d_i^2 should always be less than 1, as described in the stopping criterion d_j^2 < 1.
However, in my practice d_i^2 can become negative, and then fewer than N (e.g. 20) items are returned. What should I do to address this problem?
Thanks :)