whedon commented 7 months ago

Submitting author: @nikola-sur (Nikola Surjanovic)
Repository: https://github.com/Julia-Tempering/Pigeons-Paper
Editor: @pitsianis
Reviewers: @nsailor, @georgebisbas



Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below.

@nsailor & @georgebisbas, please carry out your review in this issue by updating the checklist below.

nsailor commented 6 months ago

This work is an implementation of the Parallel Tempering method in Julia, an improvement over MCMC for obtaining samples from probability distributions that are otherwise difficult to sample. The package supports parallelization using both Julia's built-in multi-threading features and MPI.jl, a Julia wrapper for MPI.

I greatly appreciate the attention given to what the authors term "strong parallelism invariance" (SPI), roughly meaning that the result should not depend on the degree of parallelism (especially considering the complexities of finite-precision floating-point arithmetic). This package also builds upon a Julia translation of a Java library called Splittable Random previously published by the authors, allowing deterministic random number generation across multiple threads.
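The invariance property described above can be illustrated with a minimal sketch. This is not Pigeons' implementation or API: it is a toy Python example, and the names `run_chain` and `run_all`, the seed-derivation rule, and the random-walk update are all hypothetical. It assumes only that each chain draws from its own random stream, derived from a master seed and the chain index, so the output cannot depend on how many threads are used.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_chain(master_seed, chain_index, n_steps=1000):
    # Hypothetical seed derivation: the stream depends only on
    # (master_seed, chain_index), never on the thread that runs the chain.
    rng = random.Random(master_seed * 1_000_003 + chain_index)
    x = 0.0
    for _ in range(n_steps):
        # Toy random-walk update standing in for a real MCMC step.
        x += rng.gauss(0.0, 1.0)
    return x


def run_all(master_seed, n_chains, n_threads):
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() yields results in chain order, whatever the completion order.
        return list(pool.map(lambda i: run_chain(master_seed, i), range(n_chains)))


serial = run_all(42, 8, 1)    # one worker thread
parallel = run_all(42, 8, 4)  # four worker threads
print(serial == parallel)     # prints True
```

The comparison uses exact equality on floats deliberately: each chain performs the same operations in the same order on the same random draws, so the results match bit for bit. Sharing one generator across threads, or reducing results in completion order, would typically break this.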

Overall, I think this work is a great contribution to the Julia ecosystem, notwithstanding the following minor points:

In writing the above, it is possible that I have misunderstood some part of this work, in which case, please feel free to correct me.

nikola-sur commented 6 months ago

Thank you for your comments! We will work on updating the manuscript, package, and documentation to account for your feedback.

miguelbiron commented 5 months ago

Dear @pitsianis -- we were wondering what the expected timeline for the reviews is? Should we perhaps just respond to @nsailor and not wait for @georgebisbas, or should we wait until we have both sets of comments? Thank you.

pitsianis commented 5 months ago

Please answer any outstanding issues; you don't need to do this sequentially. This exchange will also remind @georgebisbas to wrap up his review.

miguelbiron commented 5 months ago

Thanks for the clarification.

georgebisbas commented 5 months ago

Dear @miguelbiron and @pitsianis, my apologies for the delay here. Feel free to respond to the outstanding issues; I aim to prioritize this review in my task list.

miguelbiron commented 5 months ago

No worries, @georgebisbas, and thank you for prioritizing this.

georgebisbas commented 4 months ago

Hi again, and apologies for the delay. I have partly written a draft and aim to complete my review in the next few days.

georgebisbas commented 4 months ago

First, I would like to thank the authors for their patience over the last few months.

Pigeons.jl offers a high-level API to leverage shared- and distributed-memory parallelism via Julia's built-in multi-threading features and MPI.jl, a Julia wrapper for MPI. Overall, this is great work and suitable for the JuliaCon proceedings. My review below lists a few weaknesses and questions; answering them could clarify my understanding of the work in a few places, and I hope they serve as constructive feedback.

Strengths:

Weaknesses/Questions: I understand and like that it "works out of the box." It feels that the authors focus more on correctness than on performance. A few questions I have are:


nikola-sur commented 4 months ago

Thank you to both reviewers for their comments! Now that all reviews are in, we will get back to you shortly with updates and responses to the questions raised.

nikola-sur commented 4 weeks ago

We thank the reviewers for their insightful comments! Our responses to each reviewer are presented below.

Reviewer 1 (Jasan Barmparesos @nsailor)

It would be very interesting to see the scaling characteristics of this implementation, especially with respect to the other algorithm parameters, for instance, the number of chains.

Additionally, the paper mentions Pigeons being able to run on "thousands of MPI-communicating machines". It may be worth clarifying if this is something that has been tested or a possibility given the package's design.

I was not able to get a speedup with the toy MVN example using multiple threads (see this issue in the project repository). In general, it would be very helpful to have some hints in the package's documentation for tuning the degree of parallelism for a given problem, especially if the speedup from parallelism is not linear.

It would be great to have additional examples, as described in this issue

Reviewer 2 (George Bisbas @georgebisbas)

What are the memory requirements of a "typical large enough" problem to be tackled?

What are the advantages of DMP versus SMP only?

Why are no strong scaling graphs included?

After reading the paper, I feel that MPI works correctly, but I have no clear sense of why it is needed or what performance benefits it brings to the table.

Have any of the additional targets been implemented in the meantime? If so, could the paper be updated accordingly?

References

Syed, S., Bouchard-Côté, A., Deligiannidis, G., & Doucet, A. (2022). Non-reversible parallel tempering: a scalable highly parallel MCMC scheme. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(2), 321-350.

