SciML / QuasiMonteCarlo.jl

Lightweight and easy generation of quasi-Monte Carlo sequences with a ton of different methods on one API for easy parameter exploration in scientific machine learning (SciML)
https://docs.sciml.ai/QuasiMonteCarlo/stable/
MIT License
101 stars 25 forks source link

QMC vs Randomized QMC #17

Closed dmetivie closed 1 year ago

dmetivie commented 2 years ago

Hello, I was wondering if I was misunderstanding some things about this package: 1) It gathers different option to fill d-dimensional intervals with sequences. 2) Quasi Monte Carlo uses deterministic sequences. 2) To me the term sample expect some kind of randomness. Hence, it must refer to Randomized Quasi Monte Carlo. Indeed, for some sequences e.g. GridSample, LatticeRuleSample this is what happens. For example the LatticeRuleSample are generated and then shifted randomly. However, some sequences are generated deterministically, e.g. SobolSample, LowDiscrepancySample using the same sample function.

Am I missing something, or the sample function is used both for deterministic Quasi Monte Carlo sequences and random Quasi Monte Carlo sequences?

ChrisRackauckas commented 2 years ago

It's used for both depending on the sampling method.

dmetivie commented 2 years ago

Ok so I got it right. In the current package one cannot get the deterministic version of LatticRule or Grid and cannot get the randomized version of some other sequence. Moreover, the way to randomized might not be unique from what I get from the literature. Could we have something like generate(n, SequenceMethod) for the deterministic version (for the sequence for which it makes sense) and sampling(n, SequenceMethod, ShiftingMethod) for the randomized version?

ChrisRackauckas commented 2 years ago

That API suggestion doesn't consider the second user problem. Think for example NeuralPDE.jl: it would have to wrap both calls and feed it appropriately. The much better solution to the second user problem is to just use the flexibility of the types. So for example, we have generate(n, GridSample()), we can also do generate(n, GridSample(determinsitic = true)), or generate(m, LatticeRuleSample(shift = ShiftMethod())). This would make it so that it's still one function call and single types signify the full algorithm, but it would have the flexibility you're interested in.

dmetivie commented 2 years ago

That makes sense indeed to only use generate (or sample as it is now) with an option to shift or not.

ParadaCarleton commented 1 year ago

Hello, I was wondering if I was misunderstanding some things about this package:

  1. It gathers different option to fill d-dimensional intervals with sequences.
  2. Quasi Monte Carlo uses deterministic sequences.
  3. To me the term sample expect some kind of randomness. Hence, it must refer to Randomized Quasi Monte Carlo. Indeed, for some sequences e.g. GridSample, LatticeRuleSample this is what happens. For example the LatticeRuleSample are generated and then shifted randomly. However, some sequences are generated deterministically, e.g. SobolSample, LowDiscrepancySample using the same sample function.

Am I missing something, or the sample function is used both for deterministic Quasi Monte Carlo sequences and random Quasi Monte Carlo sequences?

I'd like to say, I'm definitely very interested in seeing randomized QMC sequences added to this package! I saw you created a package for them, and I'd love to see a PR, or help with it if you'd like.

ChrisRackauckas commented 1 year ago

Yes, it would be nice to integrate them here so that they can be used downstream in cases like NeuralPDE.jl. @YichengDWu was asking about this kind of thing the other day.

dmetivie commented 1 year ago

Yes it would be nice to merge RandomizedQuasiMonteCarlo.jl into this one. I was waiting to have something more mature.

I think my initial question kind of remain on the interface. We want to

  1. sample a low discrepancy x = (x_1, ..., x_N) sequences via sample(Sampler, N)
  2. Most of the time we need to randomize these sequences r= (r_1, ..., r_N). This in most package is done at the same time as 1) because it is faster, and we just care about one sequence r. I often see it written like r = sample(Sampler, N, randomize = true)
  3. Sometimes we want to randomize the same point sequence x multiple times (r[1], r[2], ..., r[M]) (to get an error estimate).

In case 3), I believe it is more natural (also faster from my experience than multiple sample(Sampler, N, randomize = true)) to call something like randomize(method, x) or randomize!(method, x). method here contain information about x so that some computations are done once. This is true for scrambling.

A simple example is for shifting lattice.

  1. Generating the randomized lattice x = sample(lattice,n) is computationally expansive (it uses generating vector etc).
  2. Randomizing from x is easy: r = shift(x) where shift is just an addition with modulus +rand(dim).
  3. This should be longer r = sample(lattice,x, random = true)

In RandomizedQuasiMonteCarlo.jl, I implemented randomization in the spirit of 3). How would you prefer this to fit into this package? API Ideally, with option 2) and 3) on the table (I guess it depends on the user application).

ParadaCarleton commented 1 year ago

Yes it would be nice to merge RandomizedQuasiMonteCarlo.jl into this one. I was waiting to have something more mature.

I think my initial question kind of remain on the interface. We want to

  1. sample a low discrepancy x = (x_1, ..., x_N) sequences via sample(Sampler, N)
  2. Most of the time we need to randomize these sequences r= (r_1, ..., r_N). This in most package is done at the same time as 1) because it is faster, and we just care about one sequence r. I often see it written like r = sample(Sampler, N, randomize = true)
  3. Sometimes we want to randomize the same point sequence x multiple times (r[1], r[2], ..., r[M]) (to get an error estimate).

In case 3), I believe it is more natural (also faster from my experience than multiple sample(Sampler, N, randomize = true)) to call something like randomize(method, x) or randomize!(method, x). method here contain information about x so that some computations are done once. This is true for scrambling.

A simple example is for shifting lattice.

  1. Generating the randomized lattice x = sample(lattice,n) is computationally expansive (it uses generating vector etc).
  2. Randomizing from x is easy: r = shift(x) where shift is just an addition with modulus +rand(dim).
  3. This should be longer r = sample(lattice,x, random = true)

In RandomizedQuasiMonteCarlo.jl, I implemented randomization in the spirit of 3). How would you prefer this to fit into this package? API Ideally, with option 2) and 3) on the table (I guess it depends on the user application).

We're planning to to rework the API anyways, so I wouldn't worry about that. Feel free to rewrite the interface, if you'd like.

Ideally, I think we would have something like sample(n, dimension, Faure(k)) to generate n independent replications of dimension^k points, and similar for Sobol (which should generate nets of size 2^k).

dmetivie commented 1 year ago

What would be the best way to integrate Randomized methods to this package? Should I fork this package and "do some copy/paste" my code into a new src file (then deprecate RandomizedQuasiMonteCarlo.jl to only update in QuasiMonteCarlo.jl)? Or create a fork which basically call RandomizedQuasiMonteCarlo.jl (as the sample function is doing with Sobol.jl, LatticeRules.jl etc)?

ChrisRackauckas commented 1 year ago

Either way, though I think the packages are small enough that just adding it to the src file is a good way to do it.

ParadaCarleton commented 1 year ago

What would be the best way to integrate Randomized methods to this package? Should I fork this package and "do some copy/paste" my code into a new src file (then deprecate RandomizedQuasiMonteCarlo.jl to only update in QuasiMonteCarlo.jl)? Or create a fork which basically call RandomizedQuasiMonteCarlo.jl (as the sample function is doing with Sobol.jl, LatticeRules.jl etc)?

I think merging RQMC into QMC.jl is a good idea.