fouticus / nsfbm

Non-Uniform Sampling of Fixed Binary Matrices
GNU General Public License v3.0
2 stars 0 forks source link

A few remarks #1

Closed michalkvasnicka closed 4 years ago

michalkvasnicka commented 4 years ago

Hi Alex, I read with great interest your paper. This is, after 7 years, a new paper which is devoted to the non-uniform sampling of binary matrix with fixed margins. I have a few remarks and/or questions:

  1. Did you made any quantitative comparison of your algorithms with Harrison's algorithm? I am using Harrison's algorithm a last few years for robust and effective non-uniform permutation sampling at EDA (Estimation of Distribution) permutation optimization class of problems, where weight matrix plays a role of suitable permutation probability distribution. I think that any kind of comparison with Harrison's algorithm should be a key part of your research in this domain.

  2. Moreover, with Harrison's algorithm is possible to update the weight matrix by so far "best" set of solutions very simply and correctly, via kth unnormalized importance weight ~ exp(logP(k)-logQ(k)). In your case I am not sure how to do that update, because you do not compute probabilities P an Q.

  3. What looks really strange is the fact, that both of your sampling algorithms completely ignore predefined binary matrix fixed margins due to the sampling process. Does it mean, that your algorithms preserve fixed margins of binary matrix automatically?

  4. How difficult could be the conversion your R-code to the MATLAB code? I am not familiar with R, but the MATLAB implementation of Harrison's algorithm is really very fast and effective. I will be very happy if will be available MATLAB code of your algorithms, too.

  5. I would like to test thoroughly your sampling algorithms but it is not straightforward for me in a case of R-code, which is a bit pity.

Finally, I would like to discuss with you any practical aspects of your new sampling algorithms, because I am using extensively these kind of algorithms in heuristic combinatorial optimization algorithms.

Michal

fouticus commented 4 years ago

Hi @michalkvasnicka,

Sorry for the tardy response.

  1. Thanks for showing interest in our work! We did not make any comparison with Harrison's & Miller's algorithm. To some extent this is implementation and system dependent, and the code we used is geared more towards clear illustration than being performant. The comparison may also depend on the problem setting. I would guess there are very large and possibly sparse matrices where swap-based methods perform better than the dynamic programming approach, but that for smaller matrices with more regular margins, H&M would perform much better. One appeal of our approach is that we sample directly from the target distribution rather than a proposal distribution, and we also don't have any recursion. To summarize, we have not yet done a detailed comparison with Harrison & Miller's algorithm, but suspect there may be some situation dependent trade-offs between the two methods.

  2. I think I don't understand your point here. I do not see where H&M update the weight matrix, except in the pre-processing step. Is that what you are referring to?

  3. Yes, the swap operation and curveball trade are both invariant with respect to the marginal sums, so we are guaranteed to preserve the marginal sums of the matrix.

  4. In principle the algorithm could be ported to MATLAB, unfortunately the last time I worked with MATLAB consistently was 6 years ago, so I'm not sure I will be able to port the algorithms there any time soon. In the other repo for this project, I have a version written in a mix of R and C, so that may be a better starting point if you know C.

  5. Sorry about this. I encourage you to examine the algorithms listed in the paper and see if you can implement them in your language of choice. One (subjective) advantage of our method is that its fundamental components (swap and curveball trade in this case) are conceptually simple. Our hope is that this makes the method more accessible to folks who may not have the requisite mathematical background to work with, for example, Harrison & Miller's algorithm.

If you have any other comments or suggestions, feel free to respond here or email me!

Alex.