aneeshnaik / lintsampler

Efficient random sampling via linear interpolation.
https://lintsampler.readthedocs.io
MIT License

JOSS review: main theoretical issue #8

Open vankesteren opened 1 week ago

vankesteren commented 1 week ago

Dear authors,

Here is an issue to track the main point raised in my review of the work you have submitted to JOSS. See the review thread here.

First of all, let me congratulate you on the excellent technical state of your package. The documentation, tests, and code quality are all top-notch. However, in my opinion an important issue needs to be addressed before the work can be considered for publication in JOSS (note that the editor may disagree!).

The main issue I have is with the statement of need. It is unclear, both in your paper and in your documentation, when I would want / need to use lintsampler.

First, a better description of relevant real-world research situations is needed, something more concrete than

For a small number of well-studied probability distributions, optimised algorithms exist to draw samples cheaply. However, one often wishes to draw samples from an arbitrary PDF for which no such algorithm is available.

As an external reader, I would dispute the "small number of well-studied probability distributions": there is in fact a large number of probability distributions, covering a wide range of real-world situations, from which sampling is relatively easy (e.g., your GMM example, which is easy to sample from without lintsampler). Which research problems that you envision do not fall within this set of distributions?
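
(For concreteness, direct sampling from a GMM takes only a couple of lines; the mixture parameters below are made up for illustration, not taken from your docs.)

# Direct GMM sampling: pick a component, then draw from it -- no general sampler needed
import numpy as np

rng = np.random.default_rng(42)
means, sds, wts = [-3.0, 1.0, 4.0], [1.0, 0.5, 1.5], [0.2, 0.5, 0.3]  # illustrative only
comps   = rng.choice(3, size=1000, p=wts)                         # component index per sample
samples = rng.normal(np.take(means, comps), np.take(sds, comps))  # exact GMM draws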

Second, and more pertinently: you clearly contrast your method with more complex methods such as MCMC, but you do not compare it to simpler methods like importance sampling. At present this information is missing: when is your method preferred over such simpler methods? Does it scale well in higher dimensions? Is it computationally more efficient? What is the advantage?

# Example of importance sampling (with resampling)
import numpy as np
from scipy.stats import uniform, norm

pdf       = norm(0, 1).pdf   # stand-in target density; any evaluable PDF works here
propdist  = uniform(-12, 24) # or norm(0, 4) or a Student's t or ...
proposals = propdist.rvs(1000)
weights   = pdf(proposals)   # / propdist.pdf(proposals) not needed because uniform!
samples   = np.random.choice(proposals, 1000, p=weights / weights.sum())

In particular, I was curious about whether importance sampling could even be better than lintsampler for the same number of samples. The results from my tests (using the examples from your documentation) indeed seem to indicate that importance sampling with a uniform proposal distribution can perform better in those cases. The advantage will only grow with better proposal distributions. The comparison is available in this GitHub gist.
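
Schematically, a comparison of this kind looks like the sketch below (a simplified sketch, not the gist itself; the target here is a stand-in normal rather than the docs' examples): draw importance-resampled points and score them against the known target CDF with a Kolmogorov-Smirnov statistic, then score lintsampler draws produced from the same density-evaluation budget the same way.

# Sketch of the comparison: KS statistic of sampled points vs. the target CDF
import numpy as np
from scipy.stats import norm, uniform, kstest

target    = norm(0, 1)                          # stand-in target, not the docs' example
propdist  = uniform(-12, 24)
proposals = propdist.rvs(1000)
weights   = target.pdf(proposals)
is_draws  = np.random.choice(proposals, 1000, p=weights / weights.sum())
print(kstest(is_draws, target.cdf).statistic)   # lower = closer to the target
# lint_draws = ...                              # score lintsampler's draws identically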

matt-graham commented 1 week ago

I agree with @vankesteren's comments here and would also like to see the statement of need clarified. I would echo @vankesteren's point that the technical quality of the package is great, and it might be that the points we are raising here are out of scope for the JOSS review process.

In

We anticipate lintsampler finding use in many applications in scientific research and other areas underpinned by statistics. In such fields, pseudo-random sampling fulfils a myriad of purposes, such as Monte Carlo integration, Bayesian inference, or the generation of initial conditions for numerical simulations. The linear interpolant sampling algorithm underpinning lintsampler is a simple and effective alternative to existing techniques, and has no publicly available implementation at present.

you state that pseudo-random sampling can be used for purposes such as Monte Carlo integration or Bayesian inference. To my mind the latter is a specialised subset of the former: the end goal of sampling in Bayesian inference is typically to form Monte Carlo estimates of intractable integrals such as posterior expectations or model evidence terms.

In the context of estimating integrals, the linear interpolant sampling algorithm implemented in lintsampler relies on some form of tensor product grid over the space being integrated, so I believe its computational cost (in terms of the number of evaluations of the integrand / density function) needed to achieve a given approximation error will scale exponentially with the dimension of the space. This, I think, puts linear interpolant sampling in the same computational complexity bracket as numerical quadrature rules for multidimensional integrals based on tensor product grids. Given that there is a wide variety of numerical quadrature rules with well-understood convergence properties and a range of Python implementations available, my question is: when, or for what sort of problems, should a user approximate an integral using linear interpolant sampling rather than a numerical quadrature method?
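
To put rough numbers on the scaling concern: a full tensor product grid with n points per dimension requires n^d density evaluations, as in the back-of-envelope sketch below (my own illustration, with an arbitrary resolution n = 100).

# Density evaluations needed by any tensor-product-grid method
# (lintsampler's grid or a product quadrature rule), n points per dimension
n = 100
for d in (1, 2, 3, 6, 10):
    print(f"d = {d:2d}: {n**d:.2e} evaluations")
# d = 1: 1.00e+02, ..., d = 10: 1.00e+20 -- exponential in the dimension d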