Open wim-cpp opened 1 year ago
I'm not sure if I follow the issue, could you elaborate? argsort gives us the sorted indices of the realizations for each dimension. The benchmarks don't show samples being far from uniformly distributed. Could you show me an MWE of where the samples are not uniform?
The paper by Deutsch & Deutsch is not very explicit and just states “realizations are ranked”. See my example above of how a list of numbers should be ranked, or see this wikipedia entry about ranking. Just generate (e.g. two-dimensional) examples using both argsort and proper ranking and you’ll see te difference.
https://github.com/sahilm89/lhsmdu/blob/925efa88e1dd87d02b6dbe0d27c64b53e8dd98e5/lhsmdu/__init__.py#L81
Resampling should be done by determining (per dimension) the index of a realization after sorting all realizations in that dimension. However, this is not what
argsort
does. For example[0.62, 0.19, 0.92, 0.22, 0.07]
should result in[3, 1, 4, 2, 0]
, butargsort
gives[4, 1, 3, 0, 2]
instead. Apply argsort again onsortedIndicesOfStrata
to get the correct vector of indices. In this exampleargsort([4, 1, 3, 0, 2]])
gives the correct result[3, 1, 4, 2, 0]
.The effect of this incorrect resampling is that the resulting Latin hypercube samples are far from uniformly distributed, as intended by LHSMDU.