HydrologicEngineeringCenter / HEC-FDA

Hydrologic Engineering Center - Flood Damage Analysis
MIT License
11 stars 2 forks source link

Possible monte carlo optimization (Latin Hypercube Sampling) #749

Open AlexRyanUSACE opened 1 year ago

AlexRyanUSACE commented 1 year ago

It may be worthwhile to investigate using latin hypercube sampling instead of the standard purely random sampling method. It looks as though all of the tools are already in place to do this. The simulation compute logic would need a method to subdivide the CDFs of all uncertainpaireddata objects into n equal size/probability partitions and then to generate random paired partition arrays for each iteration. The UncertainPairedData class could have a seperate SamplePairedData function to accept a partition as a parameter and pull random values for that partition (or more likely an array of partitions for every x?). The difficult thing would probably be to work out the best way to marry up all the partitioning in an efficient way. This may significantly cut down on the number of iterations necessary to converge by ensuring the full range of all distributions are hit in the first chunk of iterations performed, though this would come at the fixed cost of setup time.

HenryGeorgist commented 1 year ago

all that needs to happen to test that is to implement an IProvideRandomNumbers instance, https://github.com/HydrologicEngineeringCenter/HEC-FDA/blob/main/fda-model/interfaces/IProvideRandomNumbers.cs

if you can find a third party (or roll your own) the instance provided to the aggregated stage damage function method would just be replaced in a test.

Since you are putting in a low number of iterations, it is unlikely that you would see change in performance directly, this is something that would reduce overall iterations. So the comparison would be in how many iterations it takes to reach convergence, if we are hitting max with and without latin hypercube it is not a valid test.

HenryGeorgist commented 1 year ago

i googled a .net latinhypercube rng, but couldnt find one easily, i wonder if haden or woody have developed one.

HenryGeorgist commented 1 year ago

public LHC_PRN(int countofstrata, double seed){
//store those two doubles. instantiate a standard random number generator.
}
public NextRandom(){
double strata = math.floor(rand.nextdouble()*countofstrata)
double strataweight = 1/countofstrata //make sure this is a double
double startOfStrata = 1/strata //make sure this is double
double positionInStrata = rand.nextdouble()*strataweight
return startOfStrata + positionInStrata //make sure im not off by 1 in the start of strata
}
HenryGeorgist commented 1 year ago

it would be slower than a standard random number generator by roughly half, so it would need to cut our sample size by more than 50 percent to be worthwhile, so finding a proper count of strata is a bit of a goldilocks problem