sahilm89 / lhsmdu

This is an implementation of Deutsch and Deutsch, "Latin hypercube sampling with multidimensional uniformity", Journal of Statistical Planning and Inference 142 (2012), 763-772.
MIT License

Incremental/Nested latin hypercube sampling #9

Open MatthiVH opened 4 years ago

MatthiVH commented 4 years ago

I had another question concerning the sampling. In your sample code, at one point you resample from the same strata as the previous sample to obtain an additional nested sample.

As far as I can tell, this function just generates a new list of sampling points within the same selection of Latin hypercube squares as in the example (2 variables). Is there a way to create an incremental, nested sample set instead? The first time you sample, you get a set of LHS-MDU-sampled points (e.g. 20). You then repeat this multiple times, but each new batch of 20 points takes the positions of all previously sampled points into account, so that together you have 40, 60, 80 sampled points.

In this way, a single generated set of 40, 60 or 80 LHS-MDU-sampled points should be the same as an incremental nested set of 2, 3 or 4 times 20 points. The advantage is that you can split the calculations you have to run into steps of, for example, 20 until you achieve convergence. You don't have to run the whole sampled set at once, but can break it down and stop when convergence is achieved. Is there a way to do this with this package?

Attached is an example: two sampled sets generated with the same seed number. In the first (blue), 120 samples are generated. In the second (red), 240 samples are generated, but the second set includes the data points from the first set, so that if I select the first 120 sample points from the red dataset, they are the same as the blue dataset. Is this something that can be generated with this package? Nested sample.pdf

sahilm89 commented 4 years ago

If I understand what you mean, you're saying that you can include the blue sampled points when generating new strata, which you can then use to generate the next set of samples. Is that correct? If yes, I don't have a way to do that in this package. Intuitively, I think the strata should not be updated with the samples, as that'll lead to biased sampling. The current algorithm has a two-step process: first, the strata are generated, and second, the sampling occurs over a grid spacing defined by the strata. The proper way to use it would be to request all the samples you need upfront, because internally the algorithm generates a multiple of the number of samples (the preset is 5, as per the Deutsch and Deutsch paper) to generate the strata. However, if you think the math for nested sampling checks out, I'd love to know, and it's not hard to implement.

I put in the resampling just for the odd case where there is a reason that the strata need to be reused, such as 100 copies of a parameter sweep. Cheers, S
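
For readers following the thread, the idea of reusing strata can be sketched in a few lines. This is only an illustration using a plain Latin hypercube, not the package's code and not the MDU algorithm. The names `lhs_cells`, `sample_in_cells` and `cells_of` are hypothetical helpers invented for this sketch.

```python
import random

def lhs_cells(n, dims, rng):
    """Pick n grid cells so that every stratum is used once per dimension."""
    columns = []
    for _ in range(dims):
        perm = list(range(n))
        rng.shuffle(perm)
        columns.append(perm)
    return [tuple(col[i] for col in columns) for i in range(n)]

def sample_in_cells(cells, n, rng):
    """Draw one uniform point inside each cell of an n-by-n-by-... grid."""
    return [tuple((k + rng.random()) / n for k in cell) for cell in cells]

def cells_of(points, n):
    """Recover the grid cell that each point falls into."""
    return [tuple(min(int(x * n), n - 1) for x in p) for p in points]

rng = random.Random(0)
n, dims = 20, 2
cells = lhs_cells(n, dims, rng)           # step 1: fix the strata
first = sample_in_cells(cells, n, rng)    # step 2: sample inside them
again = sample_in_cells(cells, n, rng)    # "resample": same strata, new points
```

Both `first` and `again` occupy the same cells but contain different points, which is why reusing strata gives copies of a sweep rather than a larger nested design.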

MatthiVH commented 4 years ago

Well, I don't know. It's more an approach where each new LHS-sampled point checks the positions of all previous points before its own position is calculated. With an incremental LHS, you can stop the sampling wherever you want and the generated sample will always be spaced the way an LHS sample should be. An incremental Latin hypercube sample is useful, for example, if it is uncertain how many simulations can be completed within the available time, or to check when convergence is reached. Convergence can occur earlier than what papers state, for example. That helps to save time and meet paper deadlines for conferences etc.

If the incremental LHS is increased by a factor of 2, the number of strata also increases by a factor of 2, and then the previous strata are nested within the new strata, I think? (For a Latin square divided into 20, the x- and y-axis spacing is part of that of a Latin square divided into 40.) The reference below describes software that has this function.

https://dakota.sandia.gov/sites/default/files/docs/6.3/html-ref/method-sampling-sample_type-incremental_lhs.html
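
The doubling idea described above can be sketched as follows. This is a minimal illustration for a plain Latin hypercube, assuming the sample size doubles at each step. It is not the lhsmdu algorithm, makes no claim about multidimensional uniformity, and is not taken from the Dakota implementation. The function names `lhs`, `double_lhs` and `is_lhs` are hypothetical.

```python
import random

def lhs(n, dims, rng):
    """Plain Latin hypercube: one point per stratum in every dimension."""
    columns = []
    for _ in range(dims):
        perm = list(range(n))
        rng.shuffle(perm)
        columns.append([(k + rng.random()) / n for k in perm])
    return [tuple(col[i] for col in columns) for i in range(n)]

def double_lhs(points, rng):
    """Extend an n-point LHS to 2n points while keeping the old points."""
    n = len(points)
    dims = len(points[0])
    new_columns = []
    for d in range(dims):
        # Each old point already occupies one of the 2n finer strata.
        occupied = {int(p[d] * 2 * n) for p in points}
        free = [k for k in range(2 * n) if k not in occupied]
        rng.shuffle(free)
        # Place one new point in each stratum that is still empty.
        new_columns.append([(k + rng.random()) / (2 * n) for k in free])
    new_points = [tuple(col[i] for col in new_columns) for i in range(n)]
    return points + new_points

def is_lhs(points):
    """Check that each of the len(points) strata holds exactly one point."""
    n = len(points)
    return all(
        sorted(int(p[d] * n) for p in points) == list(range(n))
        for d in range(len(points[0]))
    )

rng = random.Random(0)
first = lhs(20, 2, rng)
both = double_lhs(first, rng)
```

Here `both[:20]` is exactly `first`, and both the 20-point and the 40-point sets pass `is_lhs`. Whether an analogous extension preserves the properties of the Deutsch and Deutsch method is the open question in this thread.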